0.10.1 (2024-02-05)

Features

  • Add support for incremental strategies with the Kafka connection:
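    from onetl.connection import Kafka
    from onetl.db import DBReader
    from onetl.strategy import IncrementalStrategy

    reader = DBReader(
        connection=Kafka(...),
        source="topic_name",
        hwm=DBReader.AutoDetectHWM(name="some_hwm_name", expression="offset"),
    )

    with IncrementalStrategy():
        df = reader.run()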
    

    This lets you resume reading data from a Kafka topic starting at the last committed offset from your previous run. (#202)
  • Add has_data and raise_if_no_data methods to the DBReader class. (#203)
  • Update the VMware Greenplum connector from 2.1.4 to 2.3.0. This implies:
      • Greenplum 7.x support
      • Kubernetes support
      • New read option gpdb.matchDistributionPolicy, which matches each Spark executor with a specific Greenplum segment, avoiding redundant data transfer between Greenplum segments
      • Ability to override Greenplum optimizer parameters in read/write operations
    (#208)
  • Greenplum.get_packages() now accepts an optional package_version argument, which overrides the version of the Greenplum connector package. (#208)
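
To illustrate the idea behind the incremental Kafka read in the first item, here is a minimal, library-free sketch of resuming from a stored offset. It is not onETL's implementation: OffsetStore and incremental_read are hypothetical names invented for this example, the store is a plain in-memory dict, and the topic is modelled as a single partition (real Kafka topics track one offset per partition).

```python
from dataclasses import dataclass, field


@dataclass
class OffsetStore:
    """Hypothetical in-memory HWM store: maps an HWM name to the last offset read."""

    values: dict = field(default_factory=dict)

    def get(self, name):
        return self.values.get(name)

    def set(self, name, offset):
        self.values[name] = offset


def incremental_read(messages, store, hwm_name):
    """Return only messages above the stored offset, then advance the stored offset."""
    last = store.get(hwm_name)
    fresh = [m for m in messages if last is None or m["offset"] > last]
    if fresh:
        # Remember the highest offset seen so the next run starts after it.
        store.set(hwm_name, max(m["offset"] for m in fresh))
    return fresh


store = OffsetStore()
topic = [{"offset": i, "value": f"msg-{i}"} for i in range(3)]

# First run: no stored offset yet, so every message is read.
first_run = incremental_read(topic, store, "some_hwm_name")

# A new message arrives between runs.
topic.append({"offset": 3, "value": "msg-3"})

# Second run: only the message after the stored offset is read.
second_run = incremental_read(topic, store, "some_hwm_name")
```

In this sketch a third call with no new messages returns an empty list and leaves the stored offset unchanged, which is the situation the has_data and raise_if_no_data methods are presumably meant to help detect before processing.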