From version 2.0.0, turbodbc adopts semantic versioning.
- Fixed a bug that led to `handle limit exceeded` error messages when `Cursor` objects were not closed manually. With this fix, cursors are garbage collected as expected.
- Added an option to `fetchallarrow()` that fetches integer columns in the smallest possible integer type the retrieved values fit in. While this reduces the memory footprint of the resulting table, the schema of the table is now dependent on the data it contains.
- Updated Apache Arrow support to work with version 0.8.x
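To illustrate why the schema becomes data dependent, here is a minimal pure-Python sketch of picking the narrowest integer width for a column. It is not turbodbc's implementation; the helper name `smallest_int_type` and the restriction to signed 8/16/32/64 bit types are assumptions made for this example.

```python
# Signed integer widths in bits, narrowest first (assumed set of candidate types).
WIDTHS = (8, 16, 32, 64)


def smallest_int_type(values):
    """Return the name of the narrowest signed integer type that holds all values.

    Hypothetical helper for illustration only. None entries stand in for
    NULL values and do not influence the chosen width.
    """
    present = [value for value in values if value is not None]
    for bits in WIDTHS:
        low = -(2 ** (bits - 1))
        high = 2 ** (bits - 1) - 1
        if all(low <= value <= high for value in present):
            return "int{}".format(bits)
    raise OverflowError("value does not fit into a 64 bit integer")


# The same column definition yields different types depending on its content.
print(smallest_int_type([1, 2, 3]))    # int8
print(smallest_int_type([1, 2, 300]))  # int16
```

Two queries against the same table can therefore produce tables with different schemas, which matters when results are concatenated or written to a fixed-schema file format.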
- Fixed a memory leak in `fetchallarrow()` that incremented the reference count of the returned table one time too many.
- Added support for Apache Arrow `pyarrow.Table` objects as the input for `executemanycolumns()`. In addition to direct Arrow support, this should also help with more graceful handling of Pandas DataFrames, as `pa.Table.from_pandas(...)` handles additional corner cases of Pandas data structures. Big thanks to @xhochy!
- Added an option to `fetchallarrow()` that enables the fetching of string columns as dictionary-encoded string columns. In most cases, this increases performance and reduces RAM usage. Arrow columns of type `dictionary[string]` will result in `pandas.Categorical` columns on conversion.
- Updated pybind11 dependency to version 2.2+
- Fixed a symbol visibility issue when building Arrow unit tests on systems that hide symbols by default.
- Added new keyword argument `large_decimals_as_64_bit_types` to `make_options()`. If set to `True`, decimals with more than `18` digits will be retrieved as 64 bit integers or floats as appropriate. The default retains the previous behavior of returning strings.
- Added support for the `datetime64[ns]` data type for `executemanycolumns()`. This is particularly helpful when dealing with pandas `DataFrame` objects, since this is the type that contains time stamps.
- Added the keyword argument `limit_varchar_results_to_max` to `make_options()`. This allows truncating `VARCHAR(n)` fields to `varchar_max_character_limit` characters, see the `VARCHAR(max)` item below.
- Added possibility to enforce NumPy and Apache Arrow requirements using extra requirements during installation: `pip install turbodbc[arrow,numpy]`
- Updated Apache Arrow support to work with version 0.6.x
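The options named in the items above can be combined in a single `make_options()` call. The following is a sketch only: the function name `connect_with_compat_options`, the data source name, and the limit of 1000 characters are placeholders chosen for this example.

```python
def connect_with_compat_options(dsn):
    """Open a connection with the decimal and VARCHAR options enabled.

    Hypothetical helper; turbodbc is imported lazily so the sketch can be
    loaded on machines where turbodbc is not installed.
    """
    from turbodbc import connect, make_options

    options = make_options(
        # Retrieve decimals with more than 18 digits as 64 bit numbers
        # instead of strings.
        large_decimals_as_64_bit_types=True,
        # Truncate VARCHAR(n) fields to varchar_max_character_limit characters.
        limit_varchar_results_to_max=True,
        varchar_max_character_limit=1000,  # placeholder value
    )
    return connect(dsn=dsn, turbodbc_options=options)
```

Calling `connect_with_compat_options("my_dsn")` requires a configured ODBC data source with that name.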
- Fixed an issue with retrieving result sets with `VARCHAR(max)` fields and similar types. The size of the buffer allocated for such fields can be controlled with the `varchar_max_character_limit` option to `make_options()`.
- Fixed an issue with some versions of Boost that led to problems with `datetime64[us]` columns with `executemanycolumns()`. An overflow when converting microseconds since 1970 to a database-readable timestamp could happen, badly garbling the timestamps in the process. The issue surfaced with Debian 7's Boost version (1.49), although the Boost issue was allegedly fixed with version 1.43.
- Fixed an issue that led to undefined behavior when character sequences could not be decoded into Unicode code points. The new (and defined) behavior is to ignore the offending character sequences completely.
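The exact mechanism of the Boost overflow is not described here, so the following sketch does not reproduce it. It only shows the scale of the numbers involved and one overflow-friendly way to do the conversion, namely splitting the microsecond count into whole seconds and a sub-second remainder first. The 32 bit bound is used purely as an illustrative assumption, and `timestamp_from_microseconds` is a hypothetical helper.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)
INT32_MAX = 2 ** 31 - 1  # illustrative bound, not taken from the Boost sources


def timestamp_from_microseconds(microseconds):
    """Convert microseconds since 1970-01-01 into a datetime.

    Splitting into seconds and a remainder keeps both parts small: the
    remainder is always below one million.
    """
    seconds, remainder = divmod(microseconds, 1000000)
    return EPOCH + timedelta(seconds=seconds, microseconds=remainder)


value = 1500000000123456  # a point in time in July 2017
print(value > INT32_MAX)                   # True
print(timestamp_from_microseconds(value))  # 2017-07-14 02:40:00.123456
```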
- Added new method `cursor.executemanycolumns()` that accepts parameters in columnar fashion as a list of NumPy (masked) arrays.
- CMake build now supports `conda` environments
- CMake build offers `DISABLE_CXX11_ABI` option to fix linking issues with `pyarrow` on systems with the new C++11 compliant ABI enabled
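As a rough sketch of the columnar calling convention of `executemanycolumns()`, the hypothetical helper below builds an `INSERT` statement with one placeholder per column and passes the whole list of columns in a single call. The helper name and the table name are assumptions for this example; with a real turbodbc cursor, each entry of `columns` would be a NumPy (masked) array of equal length.

```python
def insert_columns(cursor, table, columns):
    """Insert column-wise data with one placeholder per column.

    Hypothetical helper. The table name is interpolated directly into the
    SQL text, so it must come from a trusted source.
    """
    placeholders = ", ".join("?" for _ in columns)
    sql = "INSERT INTO {} VALUES ({})".format(table, placeholders)
    # One call transfers all rows; columns[i][j] is row j of column i.
    cursor.executemanycolumns(sql, columns)
    return sql
```

Compared with `executemany()`, which takes a sequence of rows, this avoids converting columnar data into row tuples first.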
- Initial support for the arrow data format with the `Cursor.fetchallarrow()` method. Still in alpha stage, mileage may vary (Windows not yet supported, UTF-16 unicode not yet supported). Big thanks to @xhochy!
- `prefer_unicode` option now also affects column name rendering when gathering results from the database. This effectively enables support for Unicode column names for some databases.
- Added module version number `turbodbc.__version__`
- Removed deprecated performance options for `connect()`. Use `connect(..., turbodbc_options=make_options(...))` instead.
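A minimal migration sketch for the removed performance options, assuming a data source name and buffer sizes chosen purely for illustration. The helper name `connect_with_performance_options` is hypothetical.

```python
def connect_with_performance_options(dsn):
    """Pass performance options via make_options() instead of connect().

    Hypothetical helper; turbodbc is imported lazily so the sketch can be
    loaded on machines where turbodbc is not installed.
    """
    from turbodbc import Megabytes, connect, make_options

    # Old style, no longer accepted:
    #   connect(dsn=dsn, read_buffer_size=Megabytes(100), use_async_io=True)
    options = make_options(
        read_buffer_size=Megabytes(100),  # placeholder size
        parameter_sets_to_buffer=1000,    # placeholder count
        use_async_io=True,
    )
    return connect(dsn=dsn, turbodbc_options=options)
```

Keeping the performance settings in one options object also keeps them apart from the keyword arguments that end up in the ODBC connection string.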
The following versions do not conform to semantic versioning. The meaning of the `major.minor.revision` versions is:
- Major: psychological ;-)
- Minor: If incremented, this indicates a breaking change
- Revision: If incremented, indicates non-breaking change (either feature or bug fix)
- Added `autocommit` as a keyword argument to `make_options()`. As the name suggests, this allows you to enable automatic `COMMIT` operations after each operation. It also improves compatibility with databases that do not support transactions.
- Added `autocommit` property to `Connection` class that allows switching autocommit mode after the connection was created.
- Fixed bug with `cursor.rowcount` not being reset to `-1` when calls to `execute()` or `executemany()` raised exceptions.
- Fixed bug with `cursor.rowcount` not showing the correct value when manipulating queries were used without placeholders, i.e., with parameters baked into the query.
- Global interpreter lock (GIL) is released during some operations to facilitate basic multi-threading (thanks @chmp)
- Internal: The return code `SQL_SUCCESS_WITH_INFO` is now treated as a success instead of an error when allocating environment, connection, and statement handles. This may improve compatibility with some databases.
- Windows is now _officially_ supported (64 bit, Python 3.5 and 3.6). From now on, code is automatically compiled and tested on Linux, OSX, and Windows (thanks @TWAC for support). Windows binary wheels are uploaded to pypi.
- Added support for fetching results in batches of NumPy objects with `cursor.fetchnumpybatches()` (thanks @yaxxie)
- MSSQL is now part of the Windows test suite (thanks @TWAC)
- `connect()` now allows you to specify a `connection_string` instead of individual arguments that are then compiled into a connection string (thanks @TWAC).
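A small sketch of consuming `cursor.fetchnumpybatches()` without holding the full result set in memory. It assumes that each batch is a mapping from column names to arrays, in the same layout that `fetchallnumpy()` returns; the helper name `count_rows_in_batches` is hypothetical.

```python
def count_rows_in_batches(cursor):
    """Count result rows by iterating over batches one at a time.

    Assumes every batch maps column names to equally long arrays.
    """
    total = 0
    for batch in cursor.fetchnumpybatches():
        columns = list(batch.values())
        if columns:
            # All columns of a batch have the same length, so the first
            # column is enough to determine the number of rows.
            total += len(columns[0])
    return total
```

Only one batch is alive at a time, so peak memory is bounded by the read buffer size rather than by the size of the result set.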
- Added support for databases that require Unicode data to be transported in UCS-2/UTF-16 format rather than UTF-8, e.g., MSSQL.
- Added _experimental_ support for Windows source distribution builds. Windows builds are not fully (or automatically) tested yet, and still require significant effort on the user side to compile (thanks @TWAC for this initial version)
- Added new `cursor.fetchnumpybatches()` method which returns a generator to iterate over result sets in batch sizes as defined by buffer size or rowcount (thanks @yaxxie)
- Added `make_options()` function that takes all performance and compatibility settings as keyword arguments.
- Deprecated all performance options (`read_buffer_size`, `use_async_io`, and `parameter_sets_to_buffer`) for `connect()`. Please move these keyword arguments to `make_options()`. Then, set `connect()`'s new keyword argument `turbodbc_options` to the result of `make_options()`. This effectively separates performance options from options passed to the ODBC connection string.
- Removed deprecated option `rows_to_buffer` from `turbodbc.connect()` (see version 0.4.1 for details).
- The order of arguments for `turbodbc.connect()` has changed; this may affect you if you have not used keyword arguments.
- The behavior of `cursor.fetchallnumpy()` has changed a little. The `mask` attribute of a generated `numpy.MaskedArray` instance is shortened to `False` from the previous `[False, ..., False]` if the mask is `False` for all entries. This can cause problems when you access individual indices of the mask.
- Updated `pybind11` requirement to at least `2.1.0`.
- Internal: Some types have changed to accommodate Linux/OSX/Windows compatibility. In particular, a few `long` types were converted to `intptr_t` and `int64_t` where appropriate. This affects the `field` type that may be used by C++ end users (if they exist).
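Regarding the shortened `mask` attribute mentioned above: code that indexes into the mask can use NumPy's `numpy.ma.getmaskarray()`, which always returns a full boolean array regardless of how the mask is stored. The helper name `is_masked_at` is hypothetical, and NumPy is imported inside the function only so that this sketch can be loaded without NumPy installed.

```python
def is_masked_at(column, index):
    """Return True if the entry at `index` of a masked array is masked.

    Works both for full masks and for the shortened scalar mask, where
    column.mask[index] would fail.
    """
    import numpy as np

    # getmaskarray() expands a scalar "no mask" value to an array of False
    # with the same shape as the data.
    return bool(np.ma.getmaskarray(column)[index])
```

For a column without any masked entries, `is_masked_at(column, 0)` returns `False` instead of raising an error on the scalar mask.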
- Internal: Remove some `const` pointers to resolve some compile issues with Xcode 6.4 (thanks @xhochy)
- Added possibility to set unixodbc include and library directories in setup.py. Required for conda builds.
- Improved compatibility with ODBC drivers (e.g. FreeTDS) that do not support ODBC's `SQLDescribeParam()` function by using a default parameter type.
- Used a default parameter type when the ODBC driver cannot determine a parameter's type, for example when using column expressions for `INSERT` statements.
- Improved compatibility with some ODBC drivers (e.g. Microsoft's official MSSQL ODBC driver) for setting timestamps with fractional seconds.
- Added support for chaining operations to `Cursor.execute()` and `Cursor.executemany()`. This allows one-liners such as `cursor.execute("SELECT 42").fetchallnumpy()`.
- Right before a database connection is closed, any open transactions are explicitly rolled back. This improves compatibility with ODBC drivers that do not perform automatic rollbacks such as Microsoft's official ODBC driver.
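The chaining support mentioned above works because `execute()` returns the cursor itself, so a fetch method can follow in the same expression. A small sketch with a hypothetical helper `fetch_single_value`, assuming the query returns at least one row with at least one column:

```python
def fetch_single_value(cursor, sql):
    """Run a query and return the first column of the first row.

    Relies on execute() returning the cursor, which allows calling
    fetchall() directly on its result.
    """
    rows = cursor.execute(sql).fetchall()
    return rows[0][0]
```

With an open turbodbc cursor, `fetch_single_value(cursor, "SELECT 42")` would be expected to return the selected value, provided the database accepts a `SELECT` without a `FROM` clause.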
- Improved stability of turbodbc when facing errors while closing connections, statements, and environments. In earlier versions, connection timeouts etc. could have led to the Python process's termination.
- Source distribution now contains license, readme, and changelog.
- Added support for OSX
- Added support for Python 3. Python 2 is still supported as well. Tested with Python 2.7, 3.4, 3.5, and 3.6.
- Added `six` package as dependency
- Turbodbc uses pybind11 instead of Boost.Python to generate its Python bindings. pybind11 is available as a Python package and automatically installed when you install turbodbc. Other Boost libraries are still required for other aspects of the code.
- A more modern compiler is required due to the pybind11 dependency. GCC 4.8 will suffice.
- Internal: Move remaining stuff depending on python to turbodbc_python
- Internal: Now requires CMake 2.8.12+ (get it with `pip install cmake`)
- Fixed build issue with older numpy versions, e.g., 1.8 (thanks @xhochy)
- Improved performance of parameter-based operations.
- Internal: Major modifications to the way parameters are handled.
- The size of the input buffers for retrieving result sets can now be set to a certain amount of memory instead of using a fixed number of rows. Use the optional `read_buffer_size` parameter of `turbodbc.connect()` and set it to instances of the new top-level classes `Megabytes` and `Rows` (thanks @LukasDistel).
- The read buffer size's default value has changed from 1,000 rows to 20 MB.
- The parameter `rows_to_buffer` of `turbodbc.connect()` is _deprecated_. You can set the `read_buffer_size` to `turbodbc.Rows(1000)` for the same effect, though it is recommended to specify the buffer size in MB.
- Internal: Libraries no longer link `libpython.so` for local development (linking is already done by the Python interpreter). This was always the case for the libraries in the packages uploaded to PyPI, so no change was necessary here.
- Internal: Some modifications to the structure of the underlying C++ code.
- NumPy support is introduced to turbodbc for retrieving result sets. Use `cursor.fetchallnumpy` to retrieve a result set as an `OrderedDict` of `column_name: column_data` pairs, where `column_data` is a NumPy `MaskedArray` of appropriate type.
- Internal: Single `turbodbc_intern` library was split up into three libraries to keep NumPy support optional. A few files were moved because of this.
turbodbc now supports asynchronous I/O operations for retrieving result sets. This means that while the main thread is busy converting an already retrieved batch of results to Python objects, another thread fetches an additional batch in the background. This may yield substantial performance improvements in the right circumstances (results are retrieved at roughly the same speed as they are converted to Python objects).
Asynchronous I/O support is experimental. Enable it with `turbodbc.connect('My data source name', use_async_io=True)`
- C++ backend: `turbodbc::column` no longer automatically binds on construction. Call `bind()` instead.
- Result set rows are returned as native Python lists instead of a not easily printable custom type.
- Improve performance of Python object conversion while reading result sets. In tests with an Exasol database, performance got about 15% better.
- C++ backend: `turbodbc::cursor` no longer allows direct access to the C++ `field` type. Instead, please use the `cursor`'s `get_query()` method, and construct a `turbodbc::result_sets::field_result_set` using the `get_results()` method.
- Fix issue that only lists were allowed for specifying parameters for queries
- Improve parameter memory consumption when the database reports very large string parameter sizes
- C++ backend: Provides more low-level ways to access the result set
- Fix issue that `dsn` parameter was always present in the connection string even if it was not set by the user's call to `connect()`
- Internal: First version to run on Travis.
- Internal: Use pytest instead of unittest for testing
- Internal: Allow for integration tests to run in custom environment
- Internal: Simplify integration test configuration
- Internal: Change C++ test framework to Google Test
- New parameter types supported: `bool`, `datetime.date`, `datetime.datetime`
- `cursor.rowcount` returns number of affected rows for manipulating queries
- `Connection` supports `rollback()`
- Improved handling of string parameters
Initial release