[Pipet Devel] Paos README

Thu Mar 4 04:32:33 EST 1999

For more information on Paos, see the attached README file (in English).

Jeff
-- 
J.W. Bizzaro                  Phone: 617-552-3905
Boston College                mailto:bizzaro at bc.edu
Department of Chemistry       http://www.uml.edu/Dept/Chem/Bizzaro/
--
-------------- next part --------------
#
# Copyright 1995 Carlos Maltzahn
#
# Permission to use, copy, modify, distribute, and sell this software
# and its documentation for any purpose is hereby granted without fee,
# provided that the above copyright notice appear in all copies and that
# both that copyright notice and this permission notice appear in
# supporting documentation, and that the name of Carlos Maltzahn or
# the University of Colorado not be used in advertising or publicity
# pertaining to distribution of the software without specific, written
# prior permission.  Carlos Maltzahn makes no representations about the
# suitability of this software for any purpose.  It is provided "as is"
# without express or implied warranty.
#
# CARLOS MALTZAHN AND THE UNIVERSITY OF COLORADO DISCLAIMS ALL WARRANTIES
# WITH REGARD TO THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF
# MERCHANTABILITY AND FITNESS, IN NO EVENT SHALL THE UNIVERSITY OF COLORADO
# BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY
# DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER
# IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING
# OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
#
# Author:
#       Carlos Maltzahn
#       Dept. of Computer Science
#       Campus Box 430
#       Univ. of Colorado, Boulder
#       Boulder, CO 80309
#
#       carlosm at cs.colorado.edu
#

                               Paos
                               ====

DISTRIBUTION
------------
Paos (Python active object server) is an active multi-user object server with a
simple query language. All software is written in Python. The distribution
consists of the following files:

  Store.py      - implements storing and locking of objects, the query language
                  and registration of notifications.

  Server.py     - implements the network interface of Store.py. Server.py
                  imports Store.py and is started by "python Server.py <port>"

  Client.py     - implements the network interface of a client. It is used
                  by importing it into a Python program.

  Schema.py     - defines the class DBobject. All objects that are to be
                  stored in the object server need to be of this class or
                  a class that inherits this class directly or indirectly.

  Utilities.py  - contains a number of functions that are used in
                  the above modules.

  example/
  --------
  Producer.py   - implements a producer that accepts input lines and stores them
                  to the object server. Started by
                  "python Producer.py <host> <port>"

  Consumer.py   - implements a consumer that prints out lines produced by
                  a producer and is started by
                  "python Consumer.py <host> <port>"

  Talk.py       - implements two way communication (accepts input lines and
                  prints out lines received from the server as notifications).
                  Uses select call and the new pipe feature.

  ExSchema.py   - contains the schema necessary for Talk.py, Producer.py and
                  Consumer.py

INSTALLATION
------------
Look at http://www.python.org/ for information on how to get and
install Python.

During installation make sure that you include at least one database
module of either dbhash, gdbm, dbm, or macdb. I would recommend dbhash
or dbm with the ndbm library because these do not limit the length of records
(which gdbm and the default library of dbm do; I don't know anything about
macdb).

Second you need to include the home directory of Paos and all your
application directories into the environment variable PYTHONPATH. In
this case the application directory would be <Paos home>/example.
Sometimes this environment variable is not accessible to the Python
application (e.g. in CGI programs for a WWW server). Then your
application programs need to import the module "sys" and set the
variable "sys.path" appropriately.
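A minimal sketch of setting "sys.path" from inside a program follows. The
directory name PAOS_HOME is a made-up placeholder, not a path shipped with
Paos; substitute your own Paos home directory.

```python
import os
import sys

# Hypothetical install location; replace with your actual Paos home.
PAOS_HOME = "/usr/local/lib/paos"

# Prepend both directories so they are searched before the rest of sys.path.
for directory in (os.path.join(PAOS_HOME, "example"), PAOS_HOME):
    if directory not in sys.path:
        sys.path.insert(0, directory)
```

This has to run before the first "import Client" (or any other Paos module)
in the program.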

STARTING THE SERVER
-------------------
You start the server by "python Server.py <port number>
[<database file name>]". The database file name is optional. The
default database file name is "database". The server then looks for a
file <database file name>.db. If it does not exist, the server
creates a new file of this name.

CONNECTING TO THE SERVER
------------------------
The client can be either a standalone or an embedded Python program.
It needs to import Client.py. This module defines a class called
"Connection" which is instantiated as follows:

import Client

conn = Client.Connection(<host name>,
                         <port number>,
                         <client name> [,
                         <callback function>])

If host and port are correctly specified this creates a TCP connection to the
server.

<client name> can be an arbitrary string which is only useful
for debugging purposes and possible future extensions.

<callback function> is optional. If specified, this function is called
if the client receives a notification from the object server (see below on
how to register notification requests).

NEW in v0.2: Instead of a callback function you can now pass a pipe
(a tuple of a read and a write file descriptor, as returned by
os.pipe()). You can use select.select(...)
on the read descriptor of the pipe. Use Utilities.READ(...) and
pickle.loads(...) to receive the notification (see below for the format
of a notification). You also need to apply conn.register_objs(...) on
the notification's object list. See the example application.
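The select-on-a-pipe pattern can be tried out with the standard library
alone. The sketch below does not use Paos: the 4-byte length prefix and the
helpers send_notification, read_exact and receive_notification are
assumptions standing in for whatever Utilities.READ(...) really does, and the
program writes to its own pipe to play the role of the connection. It assumes
a POSIX system, where select works on pipes.

```python
import os
import pickle
import select
import struct


def send_notification(write_fd, notification):
    """Write one pickled notification, prefixed by its 4-byte length.

    The framing is an assumption for this sketch, not the Paos format.
    """
    payload = pickle.dumps(notification)
    os.write(write_fd, struct.pack("!I", len(payload)) + payload)


def read_exact(fd, count):
    """Read exactly count bytes from fd."""
    chunks = []
    while count > 0:
        chunk = os.read(fd, count)
        if not chunk:
            raise EOFError("pipe closed before the message was complete")
        chunks.append(chunk)
        count -= len(chunk)
    return b"".join(chunks)


def receive_notification(read_fd, timeout=1.0):
    """Return the next notification, or None if nothing arrives in time."""
    readable, _, _ = select.select([read_fd], [], [], timeout)
    if not readable:
        return None
    (length,) = struct.unpack("!I", read_exact(read_fd, 4))
    return pickle.loads(read_exact(read_fd, length))


read_fd, write_fd = os.pipe()
# Stand-in for the connection writing a notification into the pipe.
send_notification(write_fd, (7, ["line of text"], "producer"))
notification = receive_notification(read_fd)
request_id, obj_list, committing_client = notification
```

In a real client the pipe tuple would be passed to Client.Connection, and
conn.register_objs(obj_list) would follow the unpickling step.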

All interactions with the server are defined as methods of the
Connection instance. Note also that you can have multiple connections
to the same or different servers. However, currently each object server has
a separate object ID name space. Also, each client registers with a
client-specific name, not a connection-specific name. Therefore, the
client programmer has to take care of possible name collisions. A future
version will introduce client naming that is unique over all connections
and object ID naming that is unique over all Paos object servers.

Use

conn.close()

to close the connection.

QUERYING THE OBJECT SERVER
--------------------------
In order to query the object server you use

answer = conn.get(<access mode>, <scope>, <property list>)

answer is a list of objects.

<access mode> can be either 'r' for read-only access or 'rw'
  for write-locking all objects contained in the answer. If some of the
  objects contained in answer are already write-locked by another client
  then the answer is None. Note the difference from an empty list, which
  merely indicates that there is no object in the object server that
  matches the query. Note that each failure to acquire write-locks results
  in the loss of all write-locks acquired so far!

<scope> can be either a list of persistent object references or a class name.
  A persistent object reference is a tuple as follows:
  ('__db', <db_id>).

<db_id> is an integer issued to each object that is stored in the object server.

<property list> is a list of properties. A property is a tuple as follows:
  (<attribute name>, <relation>, <value>).

<attribute name> is a string specifying the name of an attribute of objects
  specified by <scope>.

<relation> can be one of '==', '!=', 'in', 'not in', 'has', 'has not',
  'all in', 'not all in', 'some in', 'none in'.

  The meaning of '==', ..., 'not in' is the same as in Python.

  A list 'has' element iff element 'in' a list.

  A list 'has not' element iff not list 'has' element.

  List A 'all in' list B iff the elements of A are a subset of elements of B.

  List A 'not all in' list B iff not list A 'all in' list B

  List A 'some in' list B iff there exists a non-empty subset C of elements of A
    which is also a subset of elements of B.

  List A 'none in' list B iff not list A 'some in' list B

  Note that 'some in' is not the same as 'not all in'. In the first case
  the subset C has to be non-empty; in the second case C can be empty.
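The relation definitions above can be written down in plain Python. This is
a sketch of the semantics only, not the code in Store.py. The names
RELATIONS and matches are made up for illustration, and matches assumes that
all properties in a property list must hold at once, which this README does
not state explicitly.

```python
def has(a, element):
    """A list 'has' element iff element 'in' the list."""
    return element in a


def all_in(a, b):
    """True iff every element of a is also in b (true for empty a)."""
    return all(x in b for x in a)


def some_in(a, b):
    """True iff at least one element of a is also in b (false for empty a)."""
    return any(x in b for x in a)


RELATIONS = {
    "==": lambda a, b: a == b,
    "!=": lambda a, b: a != b,
    "in": lambda a, b: a in b,
    "not in": lambda a, b: a not in b,
    "has": has,
    "has not": lambda a, b: not has(a, b),
    "all in": all_in,
    "not all in": lambda a, b: not all_in(a, b),
    "some in": some_in,
    "none in": lambda a, b: not some_in(a, b),
}


def matches(obj, property_list):
    """True if obj satisfies every (attribute, relation, value) property.

    Treating the list as a conjunction is an assumption of this sketch.
    """
    return all(
        RELATIONS[relation](getattr(obj, attribute), value)
        for attribute, relation, value in property_list
    )


class Line:
    """Hypothetical object with two attributes, used only for the example."""
    def __init__(self, author, tags):
        self.author = author
        self.tags = tags


line = Line("jeff", ["draft", "urgent"])
found = matches(line, [("author", "==", "jeff"), ("tags", "has", "urgent")])
```

With A = ['a'] and B = ['b'], 'not all in' holds but 'some in' does not,
which is the difference the note above points out.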

CREATING NEW OBJECTS
--------------------

Each new object that is created in a client and that is eventually
written to the object server needs to be registered with the server
PRIOR TO COMMIT TIME. Objects that are not registered at commit time can
cause bad inconsistencies! In general new objects should be registered
before your first access to one of their attributes with references
to other persistent objects. Each registered object receives a unique
persistent object ID under the attribute name "db_id". Use

db_id_list = conn.register_objs(<obj_list>)

db_id_list is a list of db_id integers in the order corresponding to <obj_list>.

<obj_list> is a list of objects. It can contain registered and unregistered
  objects. Registering registered objects is useful in connection with
  notifications (see below). All unregistered objects in <obj_list>
  acquire write-locks.

STORING OBJECTS
---------------
Objects are stored by using

ret = conn.commit(<obj_list>)

ret is either 'ok', or None if an error occurred at the server
  (the diagnostics printed out by the server will give more information
  about the error - I'm aware that this is not a good solution; future
  versions will hopefully offer a better error handling).

<obj_list> is a list of objects. <obj_list> contains all the objects
  that are supposed to be written to the database. However, only objects
  that were previously locked will be written to the object server; read-only
  objects are simply ignored.
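The register-then-commit order can be sketched as a small helper. Only
register_objs and commit are taken from this README; store_new_objects is a
made-up name, and FakeConnection is a hypothetical in-memory stand-in that
mimics the documented return values so the sketch runs without a server.

```python
def store_new_objects(conn, new_objects):
    """Register new objects, then commit them.

    Returns the list of db_ids, or None if the commit failed.
    """
    db_ids = conn.register_objs(new_objects)   # must happen before commit
    if conn.commit(new_objects) != 'ok':
        return None
    return db_ids


class FakeConnection:
    """Hypothetical stand-in for Client.Connection (not Paos code)."""

    def __init__(self):
        self.locked = set()
        self.stored = {}
        self.next_id = 1

    def register_objs(self, obj_list):
        ids = []
        for obj in obj_list:
            if not hasattr(obj, "db_id"):
                obj.db_id = self.next_id
                self.next_id += 1
                # Unregistered objects acquire write-locks on registration.
                self.locked.add(obj.db_id)
            ids.append(obj.db_id)
        return ids

    def commit(self, obj_list):
        for obj in obj_list:
            if obj.db_id in self.locked:       # read-only objects are ignored
                self.stored[obj.db_id] = obj
        self.locked.clear()                    # commit gives up all locks
        return 'ok'


class Line:
    """Hypothetical application object."""
    def __init__(self, text):
        self.text = text


conn = FakeConnection()
lines = [Line("first"), Line("second")]
ids = store_new_objects(conn, lines)
```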

LOCKING OBJECTS
---------------
It is possible to write-lock objects once they are loaded. Use

answer = conn.lock(<obj_list>)

answer is a list of objects locked. The order of the list corresponds to
  <obj_list>. However, answer contains the versions of objects
  as they were found in the object server at locking time. If the lock
  failed answer is None and all previously acquired locks are released.

<obj_list> is a list of persistent objects to be locked. Objects that are not
  explicitly mentioned in the list (i.e., are only directly or indirectly
  referenced by objects explicitly mentioned in the list) are ignored.

Note: 'lock' is faster than 'get' in the case of failed locking: 'get'
retrieves objects before checking their locks while 'lock' checks locks first.

Note also that there are three occasions where all previously acquired
locks are lost: (1) calling "commit", (2) calling "lock" which fails, and
(3) closing the connection or terminating the client.
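Because a failed "lock" returns None, a client will often want to retry. The
helper below is one possible sketch; lock_with_retry, its parameters and the
FlakyConnection stand-in are made up for illustration, and only the
conn.lock(...) call and its None result come from this README.

```python
import time


def lock_with_retry(conn, obj_list, attempts=5, delay=0.1):
    """Call conn.lock(obj_list) until it succeeds or attempts run out.

    Returns the server's versions of the objects, or None on failure.
    """
    for attempt in range(attempts):
        answer = conn.lock(obj_list)
        if answer is not None:
            return answer
        # A failed lock released every lock this client held,
        # so back off before trying again.
        if attempt < attempts - 1:
            time.sleep(delay)
    return None


class FlakyConnection:
    """Hypothetical stand-in whose lock fails a fixed number of times."""

    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    def lock(self, obj_list):
        self.calls += 1
        if self.calls <= self.failures:
            return None
        return list(obj_list)


conn = FlakyConnection(failures=2)
locked = lock_with_retry(conn, ["a", "b"], delay=0.01)
```

Since each failure drops all earlier locks as well, anything locked before
the call has to be included in obj_list or locked again afterwards.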

ATTRIBUTE ACCESS
----------------
Assume you load objects a and b, where a.attr = b, i.e. a.attr contains a
pointer to b. Now you issue a query that loads b and c. In a naive
implementation a.attr and b would now refer to different objects because
a.attr would still point to an older version of b. With many objects
referring to each other it can become quite difficult to keep track of
all the different versions of objects.

In Paos each connection instance maintains an object cache that is
updated by all connection methods except get_raw_notification() (see
below). Attribute access on registered objects always resolves to objects
in the cache. Thus, in the above example a.attr always refers to the
newest version of b. If a user wants to keep the older version of b
she needs to assign it to a variable v before the next query. However,
b's references to other persistent objects always point to the newest
versions.

Another advantage of this policy of attribute access is that the client
will load objects from the object server as needed. For example, if
you load object a and you assign v = a.attr then the client will
automatically load b unless it is already in the cache.

This convenience comes with a price: When you define persistent object
classes you need to enumerate those attribute names that can have
attribute values which contain references to other persistent objects.
This information is kept in a special attribute called '__refs'. For
example:

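import Schema

class A(Schema.DBobject):
  def __init__(self):
    # Initialize the persistent base class first.
    Schema.DBobject.__init__(self)
    # Attributes that may hold references to other persistent objects.
    self.__refs = ['attr']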

This assumes that instances of class A have an attribute called 'attr' that
can refer to other persistent objects.
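The idea of resolving references through a cache can be shown with a toy
model. This is not the Paos implementation: ObjectCache and Persistent are
made-up classes, and a plain class attribute named refs stands in for
'__refs'. Attributes listed in refs are stored as ('__db', <db_id>) tuples
and looked up in the cache on every access.

```python
class ObjectCache:
    """Maps db_id to the newest known version of each object."""

    def __init__(self):
        self.objects = {}

    def update(self, obj):
        self.objects[obj.db_id] = obj


class Persistent:
    """Toy persistent object; attributes named in refs hold references."""

    refs = ()

    def __init__(self, cache, db_id, **attrs):
        # Write bookkeeping fields directly to bypass __setattr__.
        self.__dict__["cache"] = cache
        self.__dict__["db_id"] = db_id
        self.__dict__["stored"] = {}
        for name, value in attrs.items():
            setattr(self, name, value)
        cache.update(self)

    def __setattr__(self, name, value):
        if name in self.refs:
            value = ("__db", value.db_id)   # keep a reference, not the object
        self.stored[name] = value

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        try:
            value = self.stored[name]
        except KeyError:
            raise AttributeError(name) from None
        if name in self.refs:
            return self.cache.objects[value[1]]   # newest version wins
        return value


class A(Persistent):
    refs = ("attr",)


class B(Persistent):
    pass


cache = ObjectCache()
old_b = B(cache, 2, text="old")
a = A(cache, 1, attr=old_b)
new_b = B(cache, 2, text="new")   # a later query loads a newer version of b
current = a.attr                  # resolved through the cache: this is new_b
```

The variable old_b still holds the older version, which matches the advice
above to keep an old version in a variable if it is still needed.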

NOTIFICATIONS
-------------
With

request_id = conn.register(<scope>, <property list>)

you can register a notification request. <scope> and <property list> have
the same meaning as in "get". A notification request is a query that is
stored at the object server and evaluated in each subsequent "commit"
against the set of objects that is written to the object server. If the
result of such a query is not empty the client which registered the
notification request is notified. The format of the notification is

(<request_id>, <obj_list>, <committing client>)

<request_id> corresponds with the returned value of the corresponding
  "register" call, i.e. identifies the corresponding query.

<obj_list> is the list of objects that matches the query.

<committing client> identifies the client that triggered the notification.

Note that no client can register a notification request for
another client; each notification request corresponds to exactly one
client. Also note that notification requests do not survive a client's
lifetime: If a client terminates (or crashes) all notification requests
owned by that client are deleted.

There are multiple ways for a client to process notifications. If the
connection to the server was created with a callback function as the
fourth argument then the client is interrupted at each
notification (with the signal SIGUSR1) and the callback function is
called. Otherwise the client needs to poll for notifications. In both
cases notifications are retrieved by

notification = conn.get_notification()

Note that a notification is generated for each registered notification
request. For example, if a client registered two requests and a
subsequent commit contains objects matching both requests then the
object server sends two notifications to the client. Also note that
multiple notifications triggered by one commit are sent in the order
they were registered.
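The behaviour described in this section (one notification per matching
request, sent in registration order) can be modelled in a few lines. The
class NotificationRegistry is a made-up toy, not code from Store.py, and a
plain predicate function stands in for the (<scope>, <property list>) pair.

```python
class NotificationRegistry:
    """Toy model of the server's stored notification requests."""

    def __init__(self):
        # (request_id, client, predicate) tuples in registration order.
        self.requests = []
        self.next_id = 1

    def register(self, client, predicate):
        request_id = self.next_id
        self.next_id += 1
        self.requests.append((request_id, client, predicate))
        return request_id

    def unregister(self, request_id):
        self.requests = [r for r in self.requests if r[0] != request_id]

    def commit(self, committing_client, obj_list):
        """Return (recipient, notification) pairs triggered by this commit."""
        sent = []
        for request_id, client, predicate in self.requests:
            matched = [obj for obj in obj_list if predicate(obj)]
            if matched:   # only non-empty results produce a notification
                sent.append((client, (request_id, matched, committing_client)))
        return sent


registry = NotificationRegistry()
first = registry.register("consumer", lambda line: "error" in line)
second = registry.register("consumer", lambda line: line.startswith("disk"))

# One commit matches both requests, so two notifications are produced.
sent = registry.commit("producer", ["disk error", "all fine"])
```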

Each "get_notification" updates the object cache (see paragraph about
attribute access). One can avoid this by using

notification = conn.get_raw_notification()

Note however, that attribute access in objects within the notification
is not resolved correctly since these objects are disconnected from the
attribute resolution mechanism discussed above. To connect these objects
to the resolution mechanism use "register_objs" (this updates the object
cache).

If there are no notifications "get_notification" returns None.

With

conn.unregister(<request_id>)

you can retract a notification request.
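A client that polls can collect everything that is pending by calling
"get_notification" until it returns None. The helper name
drain_notifications and the QueueConnection stand-in are assumptions made
for this sketch; only get_notification and its None result are taken from
this README.

```python
def drain_notifications(conn):
    """Collect pending notifications until get_notification returns None."""
    notifications = []
    while True:
        notification = conn.get_notification()
        if notification is None:
            break
        notifications.append(notification)
    return notifications


class QueueConnection:
    """Hypothetical stand-in that hands out queued notifications in order."""

    def __init__(self, pending):
        self.pending = list(pending)

    def get_notification(self):
        if not self.pending:
            return None
        return self.pending.pop(0)


conn = QueueConnection([
    (1, ["first line"], "producer"),
    (2, ["first line"], "producer"),
])
received = drain_notifications(conn)
```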