Reference

newt.db module-level functions

newt.db.connection(dsn, **kw)

Create a newt.db.Connection object.

Keyword options can be used to provide either ZODB.DB options or RelStorage options.

newt.db.DB(dsn, **kw)

Create a Newt DB database object.

Keyword options can be used to provide either ZODB.DB options or RelStorage options.

A Newt DB object is a thin wrapper around ZODB.DB objects. When its open method is called, it returns newt.db.Connection objects.

newt.db.storage(dsn, keep_history=False, transform=None, **kw)

Create a RelStorage storage using the newt PostgreSQL adapter.

Keyword options can be used to provide either ZODB.DB options or RelStorage options.

newt.db.pg_connection(dsn, driver_name='auto')

Create a PostgreSQL (not newt) database connection

This function should be used rather than, for example, calling psycopg2.connect, because it can use other Postgres drivers depending on the Python environment and available modules.

class newt.db.Connection(connection)

Wrapper for ZODB.Connection.Connection objects

newt.db.Connection objects provide extra helper methods for searching and for transaction management.

abort()

Abort the current transaction

commit()

Commit the current transaction
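As a minimal sketch of how commit and abort might be paired, assuming conn is a newt.db.Connection: the helper name run_in_transaction and the work callable are hypothetical and not part of newt.db.

```python
def run_in_transaction(conn, work):
    """Run work(conn), committing on success and aborting on any error.

    run_in_transaction and work are illustrative names; only commit()
    and abort() come from the Connection API described above.
    """
    try:
        result = work(conn)
        conn.commit()   # save the changes made by work
    except Exception:
        conn.abort()    # discard any partial changes, then re-raise
        raise
    return result
```

A caller would pass a function that modifies persistent objects reachable from the connection; the helper only decides whether those changes are kept.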

create_text_index(fname, D=None, C=None, B=None, A=None, config=None)

Set up a newt full-text index.

The create_text_index_sql method is used to compute SQL, which is then executed to set up the index. (This can take a long time on an existing database with many records.)

The SQL is executed against the database associated with the given connection, but a separate connection is used, so its execution is independent of the current transaction.

static create_text_index_sql(fname, D=None, C=None, B=None, A=None)

Compute and return SQL to set up a newt text index.

The resulting SQL contains a statement to create a PL/pgSQL function and an index-creation statement that uses it.

The first argument is the name of the function to be generated. The second argument is a single expression or property name, or a sequence of expressions or property names. If expressions are given, they will be evaluated against the newt JSON state column. Values consisting of alphanumeric characters (including underscores) are treated as names, and other values are treated as expressions.

Additional arguments, C, B, and A can be used to supply expressions and/or names for text to be extracted with different weights for ranking. See: https://www.postgresql.org/docs/current/static/textsearch-controls.html#TEXTSEARCH-RANKING

The config argument may be used to specify which text search configuration to use. If not specified, the server-configured default configuration is used.
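The name-versus-expression rule and a weighted index can be sketched as follows. This is an illustration of the rule as stated above, not the library's own implementation: looks_like_name, index_title_and_body, the function name doc_text, and the title and body properties are all hypothetical.

```python
import re

# Alphanumeric characters and underscores only, per the rule described above.
_NAME = re.compile(r"[A-Za-z0-9_]+")

def looks_like_name(value):
    """Return True if value would be read as a property name, not an expression."""
    return _NAME.fullmatch(value) is not None

def index_title_and_body(conn, fname="doc_text"):
    """Create a text index over two hypothetical properties.

    "body" is passed as D and "title" as A, so title text is extracted
    with weight A. Both are plain names rather than expressions.
    """
    conn.create_text_index(fname, "body", A="title")
```

Here conn is assumed to be a newt.db.Connection. Because index creation can take a long time on a large database, a call like this belongs in a setup or migration step rather than in request handling.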

query_data(query, *args, **kw)

Query the newt Postgres database for raw data.

Query parameters may be provided as either positional arguments or keyword arguments. They are inserted into the query where there are placeholders of the form: %s for positional arguments, or %(NAME)s for keyword arguments.

A sequence of data tuples is returned.
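A small sketch of positional placeholders with query_data, assuming conn is a newt.db.Connection and that stored objects have a title property in their JSON state. The function name titles_matching and the title property are hypothetical.

```python
def titles_matching(conn, pattern, limit=10):
    """Return up to limit (zoid, title) tuples whose title matches pattern.

    Each %s placeholder is filled, in order, from the positional
    arguments that follow the query string.
    """
    return conn.query_data(
        "select zoid, state->>'title' from newt"
        " where state->>'title' like %s order by zoid limit %s",
        pattern, limit,
    )
```

Passing values as parameters, rather than formatting them into the SQL string, leaves quoting to the database driver.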

search(query, *args, **kw)

Search for newt objects using an SQL query.

Query parameters may be provided as either positional arguments or keyword arguments. They are inserted into the query where there are placeholders of the form %s for positional arguments or %(NAME)s for keyword arguments.

The query results must contain the columns zoid and ghost_pickle. It’s simplest, and costs nothing, to select all columns (using *) from the newt table.

A sequence of newt objects is returned.

search_batch(query, args, batch_start, batch_size=None)

Query for a batch of newt objects.

Query parameters are provided using the args argument, which may be a tuple or a dictionary. They are inserted into the query where there are placeholders of the form %s for an arguments tuple or %(NAME)s for an arguments dict.

The batch_start and batch_size arguments are used to specify the result batch. An ORDER BY clause should be used to order results.

The total result count and sequence of batch result objects are returned.

The query parameters, args, may be omitted. (In this case, the batch_size parameter will be None and the other arguments will be re-arranged appropriately, so a batch size must still be supplied.) You might use this feature if you pre-inserted data using a database cursor’s mogrify method.
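Paging through a result set with search_batch might look like the sketch below. It assumes conn is a newt.db.Connection, that batch_start is a zero-based offset into the ordered results, and that the query contains an ORDER BY clause. The helper name iter_batches is hypothetical.

```python
def iter_batches(conn, query, args, batch_size):
    """Yield successive lists of objects until the result set is exhausted."""
    start = 0
    while True:
        # search_batch returns the total result count and one batch of objects.
        total, batch = conn.search_batch(query, args, start, batch_size)
        batch = list(batch)
        if not batch:
            return
        yield batch
        start += len(batch)
        if start >= total:
            return
```

Each batch is a separate query, so results can shift between calls if matching objects are added or removed while paging.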

where(query_tail, *args, **kw)

Query for objects satisfying criteria.

This is a convenience wrapper for the search method. The first argument is SQL text for query criteria to be included in an SQL where clause.

This method simply appends its first argument to:

select * from newt where

and so may also contain code that can be included after a where clause, such as an ORDER BY clause.

Query parameters may be provided as either positional arguments or keyword arguments. They are inserted into the query where there are placeholders of the form: %s for positional arguments, or %(NAME)s for keyword arguments.

A sequence of newt objects is returned.
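The keyword-placeholder form can be sketched with where, assuming conn is a newt.db.Connection and that stored objects carry owner and status properties in their JSON state. The function name open_tasks_for and both property names are hypothetical.

```python
def open_tasks_for(conn, owner):
    """Return objects whose owner matches and whose status is 'open'.

    The text is appended after "select * from newt where", so it can
    end with an ORDER BY clause. %(owner)s is filled from the keyword
    argument of the same name.
    """
    return conn.where(
        "state->>'owner' = %(owner)s and state->>'status' = 'open'"
        " order by zoid",
        owner=owner,
    )
```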

where_batch(query_tail, args, batch_start, batch_size=None)

Query for a batch of objects satisfying criteria.

Like the where method, this is a convenience wrapper for the search_batch method.

Query parameters are provided using the second, args argument, which may be a tuple or a dictionary. They are inserted into the query where there are placeholders of the form %s for an arguments tuple or %(NAME)s for an arguments dict.

The batch_start and batch_size arguments are used to specify the result batch. An ORDER BY clause should be used to order results.

The total result count and sequence of batch result objects are returned.

The query parameters, args, may be omitted. (In this case, the batch_size parameter will be None and the other arguments will be re-arranged appropriately, so a batch size must still be supplied.) You might use this feature if you pre-inserted data using a database cursor’s mogrify method.

newt.db.search module-level functions

Search API.

It’s assumed that the API is used with an object stored in a RelStorage with a Postgres back end.

newt.db.search.where(conn, query_tail, *args, **kw)

Query for objects satisfying criteria.

This is a convenience wrapper for the search function. The first argument after conn is SQL text for query criteria to be included in an SQL where clause.

This function simply appends that argument to:

select * from newt where

and so may also contain code that can be included after a where clause, such as an ORDER BY clause.

Query parameters may be provided as either positional arguments or keyword arguments. They are inserted into the query where there are placeholders of the form: %s for positional arguments, or %(NAME)s for keyword arguments.

A sequence of newt objects is returned.

newt.db.search.search(conn, query, *args, **kw)

Search for newt objects using an SQL query.

Query parameters may be provided as either positional arguments or keyword arguments. They are inserted into the query where there are placeholders of the form %s for positional arguments or %(NAME)s for keyword arguments.

The query results must contain the columns zoid and ghost_pickle. It’s simplest, and costs nothing, to select all columns (using *) from the newt table.

A sequence of newt objects is returned.

newt.db.search.where_batch(conn, query_tail, args, batch_start, batch_size=None)

Query for a batch of objects satisfying criteria.

Like the where function, this is a convenience wrapper for the search_batch function.

Query parameters are provided using the second, args argument, which may be a tuple or a dictionary. They are inserted into the query where there are placeholders of the form %s for an arguments tuple or %(NAME)s for an arguments dict.

The batch_start and batch_size arguments are used to specify the result batch. An ORDER BY clause should be used to order results.

The total result count and sequence of batch result objects are returned.

The query parameters, args, may be omitted. (In this case, the batch_size parameter will be None and the other arguments will be re-arranged appropriately, so a batch size must still be supplied.) You might use this feature if you pre-inserted data using a database cursor’s mogrify method.

newt.db.search.search_batch(conn, query, args, batch_start, batch_size=None)

Query for a batch of newt objects.

Query parameters are provided using the args argument, which may be a tuple or a dictionary. They are inserted into the query where there are placeholders of the form %s for an arguments tuple or %(NAME)s for an arguments dict.

The batch_start and batch_size arguments are used to specify the result batch. An ORDER BY clause should be used to order results.

The total result count and sequence of batch result objects are returned.

The query parameters, args, may be omitted. (In this case, the batch_size parameter will be None and the other arguments will be re-arranged appropriately, so a batch size must still be supplied.) You might use this feature if you pre-inserted data using a database cursor’s mogrify method.

newt.db.search.query_data(conn, query, *args, **kw)

Query the newt Postgres database for raw data.

Query parameters may be provided as either positional arguments or keyword arguments. They are inserted into the query where there are placeholders of the form: %s for positional arguments, or %(NAME)s for keyword arguments.

A sequence of data tuples is returned.

newt.db.search.create_text_index_sql(fname, D=None, C=None, B=None, A=None, config=None)

Compute and return SQL to set up a newt text index.

The resulting SQL contains a statement to create a PL/pgSQL function and an index-creation statement that uses it.

The first argument is the name of the function to be generated. The second argument is a single expression or property name, or a sequence of expressions or property names. If expressions are given, they will be evaluated against the newt JSON state column. Values consisting of alphanumeric characters (including underscores) are treated as names, and other values are treated as expressions.

Additional arguments, C, B, and A can be used to supply expressions and/or names for text to be extracted with different weights for ranking. See: https://www.postgresql.org/docs/current/static/textsearch-controls.html#TEXTSEARCH-RANKING

The config argument may be used to specify which text search configuration to use. If not specified, the server-configured default configuration is used.

newt.db.search.create_text_index(conn, fname, D, C=None, B=None, A=None, config=None)

Set up a newt full-text index.

The create_text_index_sql method is used to compute SQL, which is then executed to set up the index. (This can take a long time on an existing database with many records.)

The SQL is executed against the database associated with the given connection, but a separate connection is used, so its execution is independent of the current transaction.

newt.db.search.read_only_cursor(conn)

Get a database cursor for reading.

The returned cursor can be used to make PostgreSQL queries and to perform safe SQL generation using the cursor’s mogrify method.

The caller must close the returned cursor after use.
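The following sketch combines a read-only cursor's mogrify method with a batch search that omits args. It rests on several assumptions: open_cursor stands for newt.db.search.read_only_cursor (passed in so the sketch has no import-time dependency), conn is a newt.db.Connection, mogrify may return bytes (as psycopg2 does), and the client encoding is UTF-8. The names render_sql and first_batch are hypothetical.

```python
from contextlib import closing

def render_sql(cursor, template, params):
    """Let the driver quote params into template and return the SQL as text."""
    sql = cursor.mogrify(template, params)
    # Decoding as UTF-8 is an assumption about the client encoding.
    return sql.decode("utf-8") if isinstance(sql, bytes) else sql

def first_batch(conn, open_cursor, template, params, batch_size):
    """Return (total, batch) for the first batch_size results."""
    # closing() guarantees the cursor is closed, as the caller is required to do.
    with closing(open_cursor(conn)) as cursor:
        query = render_sql(cursor, template, params)
    # args is omitted because the query already contains its parameter values.
    return conn.search_batch(query, 0, batch_size)
```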

newt.db.follow module-level functions

newt.db.follow.updates(conn, start_tid=-1, end_tid=None, batch_limit=100000, internal_batch_size=100, poll_timeout=300)

Create a data-update iterator

This function returns an iterator of batches, where each batch is an iterator of records. Each record is a triple consisting of an integer transaction id, an integer object id, and data. A sample use:

>>> import newt.db
>>> import newt.db.follow
>>> connection = newt.db.pg_connection('')
>>> for batch in newt.db.follow.updates(connection):
...     for tid, zoid, data in batch:
...         print(tid, zoid, len(data))

If no end_tid is provided, the iterator will iterate until interrupted.

Parameters:

conn
A Postgres database connection.
start_tid
Start tid, expressed as an integer. The iterator starts at the first transaction after this tid.
end_tid
End tid, expressed as an integer. The iterator stops at this, or at the end of data, whichever is less. If the end tid is None, the iterator will run indefinitely, returning new data as they are committed.
batch_limit
A soft batch size limit. When a batch reaches this limit, it will end at the next transaction boundary. The purpose of this limit is to limit read-transaction size.
internal_batch_size
The size of the internal Postgres iterator. Data aren’t loaded from Postgres all at once. Server-side cursors are used and data are loaded from the server in internal_batch_size batches.
poll_timeout
When no end_tid is specified, this specifies how often to poll for changes. Note that a trigger is created and used to notify the iterator of changes, so changes are detected quickly. The poll timeout is just a backstop.

newt.db.follow.get_progress_tid(connection, id)

Get the current progress for a follow client.

Return the last saved integer transaction id for the client, or -1, if one hasn’t been saved before.

A follow client often updates some other data based on the data returned from updates. It may stop and restart later. To do this, it will call set_progress_tid to save its progress and later call get_progress_tid to find where it left off. It can then pass the returned tid as start_tid to updates.

The connection argument must be a PostgreSQL connection string or connection.

The id parameter is used to identify which progress is wanted. This should uniquely identify the client, and generally a dotted name (__name__) of the client module is used. This allows multiple clients to have their progress tracked.

newt.db.follow.set_progress_tid(connection, id, tid)

Set the current progress for a follow client.

See get_progress_tid.

The connection argument must be a PostgreSQL connection string or connection.

The id argument is a string identifying a client. It should generally be a dotted name (usually __name__) of the client module. It must uniquely identify the client.

The tid argument is the most recently processed transaction id as an int.
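The stop-and-resume pattern described for get_progress_tid and set_progress_tid can be sketched as below. The follow parameter is assumed to be the newt.db.follow module (passed in so the sketch can be exercised without a database), conn is a PostgreSQL connection, and handle is a hypothetical callback. The name follow_once is also hypothetical.

```python
def follow_once(conn, follow, client_id, handle, end_tid):
    """Process updates up to end_tid, saving progress after each batch."""
    # Resume after the last saved tid, or from -1 if nothing was saved.
    start_tid = follow.get_progress_tid(conn, client_id)
    for batch in follow.updates(conn, start_tid=start_tid, end_tid=end_tid):
        last_tid = None
        for tid, zoid, data in batch:
            handle(tid, zoid, data)
            last_tid = tid
        # Batches end at transaction boundaries, so the last tid seen
        # in a batch is a safe point to resume from.
        if last_tid is not None:
            follow.set_progress_tid(conn, client_id, last_tid)
```

A client module would typically pass __name__ as client_id so that its progress is tracked separately from other clients.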

newt.db.follow.listen(dsn, timeout_on_start=False, poll_timeout=300)

Listen for newt database updates.

Returns an iterator that returns integer transaction ids or None values.

The purpose of this method is to determine if there are updates. If transactions are committed very quickly, then not all of them will be returned by the iterator.

None values indicate that poll_timeout seconds have passed since the last update.

Parameters:

dsn
A Postgres connection string
timeout_on_start

Force None to be returned immediately after listening for notifications.

This is useful in some special cases to avoid having to time out waiting for changes that happened before the iterator began listening.

poll_timeout
A timeout after which None is returned if there are no changes. (This is a backstop to PostgreSQL’s notification system.)
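Because the iterator mixes transaction ids with None timeouts, a consumer has to tell the two apart. A minimal sketch, assuming notifications is the iterator returned by newt.db.follow.listen; the helper name next_change and the max_timeouts parameter are hypothetical.

```python
def next_change(notifications, max_timeouts=3):
    """Return the next transaction id, or None after max_timeouts idle polls."""
    timeouts = 0
    for tid in notifications:
        if tid is not None:
            return tid          # a change was reported
        timeouts += 1           # None means the poll timeout elapsed
        if timeouts >= max_timeouts:
            return None
    return None
```

Since not every transaction id is guaranteed to be returned, the value is best treated as a signal that something changed rather than as a complete log of commits.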

newt.db.jsonpickle module-level functions

Convert pickles to JSON

The goal of the conversion is to produce JSON that is useful for indexing, querying and reporting in external systems like Postgres and Elasticsearch.

class newt.db.jsonpickle.Jsonifier(skip_class=None, transform=None)

__call__(id, data)

Convert data from a ZODB data record to data used by newt.

The data returned is a class name, ghost pickle, and state triple. The state is a JSON-formatted string. The ghost pickle is a binary string that can be used to create a ZODB ghost object.

If there is an error converting data, if the data is empty, or if the skip_class function returns a true value, then (None, None, None) is returned.

Parameters:

id
A data identifier (e.g. an object id) used when logging errors.
data
Pickle data to be converted.

__init__(skip_class=None, transform=None)

Create a callable for converting database data to Newt JSON

Parameters:

skip_class
A callable that will be called with the class name extracted from the data. If the callable returns a true value, then data won’t be converted to JSON and (None, None, None) is returned. The default, which is available as the skip_class attribute of the Jsonifier class, skips objects from the BTrees package and blobs.
transform

A function that transforms a record’s state JSON.

If provided, it should accept a class name and a state string in JSON format.

The transform function should return a new state string or None. If None is returned, the original state is used.

If the function returns an empty string, then the Jsonifier will return (None, None, None). In other words, providing a transform that returns an empty string is equivalent to providing a skip_class function that returns True.

Returning anything other than None or a string is an error and behavior is undefined.
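The three possible return values of a transform can be illustrated with a short sketch. The function name scrub_transform, the class name myapp.models.Session, and the password property are all hypothetical.

```python
import json

def scrub_transform(class_name, state):
    """Example transform: skip one class, drop one field, else keep the state."""
    if class_name == "myapp.models.Session":   # hypothetical class name
        return ""                              # empty string: record is skipped
    data = json.loads(state)
    if isinstance(data, dict) and "password" in data:
        del data["password"]
        return json.dumps(data)                # new state string replaces the old
    return None                                # None: original state is used
```

A function like this would be passed as the transform argument, for example to Jsonifier or to newt.db.storage.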

class newt.db.jsonpickle.JsonUnpickler(pickle)

Unpickler that returns JSON

Usage:

>>> import pickle
>>> from newt.db.jsonpickle import JsonUnpickler
>>> apickle = pickle.dumps([1,2])
>>> unpickler = JsonUnpickler(apickle)
>>> json_string = unpickler.load()
>>> unpickler.pos == len(apickle)
True