Next Gen STINGER Python Interface for Prototyping

Published on Dec 1, 2013

The Next Gen development version of STINGER (https://github.com/robmccoll/stinger) now has Python ctypes interfaces to the STINGER core library and the STINGER net library. These enable the use of STINGER in standalone applications and in the streaming processing pipeline as a data stream, algorithm, or monitor. These libraries can be found in ./src/py.

Why ctypes (and not Cython or a native module)? These interfaces are intended for rapid prototyping, so flexibility and ease of maintenance were emphasized over performance. That ruled out a native module. Cython would have added a dependency, whereas ctypes is part of the standard library.

A simple (and pointless) example only using the STINGER data structure:

import sys
import os

# so the imports work
sys.path.append("/path/to/stinger/src/py/")

# for stinger internally
os.environ['STINGER_LIB_PATH'] = "/path/to/stinger/build/lib/"

import stinger.stinger_net as sn
import stinger.stinger_core as sc

s = sc.Stinger()
s.insert_edge_pair('stinger', 'stinger_core', 'sub_library')
s.insert_edge_pair('stinger', 'stinger_net', 'sub_library', weight=2, ts=5)
s.set_vtype('stinger_core', 'library')

print s.edges_of('stinger')
print s.max_active_vtx()

# note that each string in stinger is mapped to an integer.
# vertices are numbered 0, 1, 2, ... up to (but not including)
# STINGER_MAX_LVERTICES. vertex and edge types also have their
# own mappings, but each of those starts at 1

# by this point, we have mapped three vertices (0,1,2), so 
# when we insert an edge to 3 directly, it has no name
s.insert_edge_pair(3, 9, 2, 2, 5)

# this however, will create a mapping that happens to get allocated 
# 3, since it is the next available mapping
s.insert_edge_pair('three', 'five')

# as a result, these return overlapping results
print s.edges_of(3)
print s.edges_of('three')

# it is generally best to work primarily with strings or primarily
# with integers

s.save_to_file('testing')
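
To make the overlap between edges_of(3) and edges_of('three') concrete, here is a toy model of sequential name-to-id allocation. It is only a sketch: ToyGraph and its internals are hypothetical and are not how STINGER is implemented. It assumes, as the comments above describe, that a new string gets the next unused mapping regardless of which raw integers have already been used.

```python
# Toy model only: this is NOT STINGER's implementation, just a sketch of
# how sequential name-to-id mapping can collide with raw integer ids.

class ToyGraph(object):
    def __init__(self):
        self.name_to_id = {}   # string name -> integer vertex id
        self.adj = {}          # vertex id -> set of neighbor ids

    def _vid(self, v):
        """Return the integer id for v, allocating the next id for new names."""
        if isinstance(v, int):
            return v           # raw integers bypass the name mapping
        if v not in self.name_to_id:
            self.name_to_id[v] = len(self.name_to_id)
        return self.name_to_id[v]

    def insert_edge_pair(self, src, dst):
        # store the edge in both directions
        s, d = self._vid(src), self._vid(dst)
        self.adj.setdefault(s, set()).add(d)
        self.adj.setdefault(d, set()).add(s)

    def edges_of(self, v):
        return sorted(self.adj.get(self._vid(v), set()))


g = ToyGraph()
g.insert_edge_pair('stinger', 'stinger_core')   # names get ids 0 and 1
g.insert_edge_pair('stinger', 'stinger_net')    # 'stinger_net' gets id 2
g.insert_edge_pair(3, 9)                        # raw ids, no names recorded
g.insert_edge_pair('three', 'five')             # 'three' -> 3, 'five' -> 4

print(g.edges_of(3))        # [4, 9]
print(g.edges_of('three'))  # [4, 9]
```

Both calls report the same neighbors because the name 'three' and the raw id 3 end up referring to the same vertex, which is why mixing the two styles is best avoided.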

And one using the stinger_net library to stream graph edges:

import sys
import os

sys.path.append("/path/to/stinger/src/py/")
os.environ['STINGER_LIB_PATH'] = "/path/to/stinger/build/lib/"

import stinger.stinger_net as sn
import stinger.stinger_core as sc

s = sn.StingerStream('localhost', 10102)

s.add_insert('stinger', 'stinger_core', 'sub_library')
s.add_insert('stinger', 'stinger_net', 'sub_library', weight=2, ts=5)

s.send_batch()

s.add_delete('stinger', 'stinger_core')

s.send_batch()
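
When streaming a longer edge list, the add_insert / send_batch pattern above can be wrapped in a small helper that flushes every few edges. This is a sketch, not part of stinger_net: stream_in_batches and RecordingStream are hypothetical names, and RecordingStream is a stand-in that only records calls so the example runs without a STINGER server. The helper relies only on the two methods used in the example above.

```python
def stream_in_batches(stream, edges, batch_size):
    """Queue (source, destination, edge type) triples, flushing every batch_size edges."""
    pending = 0
    for src, dst, etype in edges:
        stream.add_insert(src, dst, etype)
        pending += 1
        if pending == batch_size:
            stream.send_batch()
            pending = 0
    if pending:
        stream.send_batch()   # flush the final partial batch


class RecordingStream(object):
    """Hypothetical stand-in for sn.StingerStream; it only records calls."""
    def __init__(self):
        self.queued = []
        self.batches = []

    def add_insert(self, src, dst, etype):
        self.queued.append((src, dst, etype))

    def send_batch(self):
        self.batches.append(self.queued)
        self.queued = []


edges = [('a', 'b', 'link'), ('b', 'c', 'link'), ('c', 'a', 'link')]
rec = RecordingStream()
stream_in_batches(rec, edges, batch_size=2)
print([len(b) for b in rec.batches])   # [2, 1]
```

With a real connection, the object passed in would be the sn.StingerStream instance from the example above instead of the recorder.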

And last, an algorithm example. Monitors can hopefully be inferred from this example and the code in stinger_net.py, but feel free to ask questions below or submit issues on GitHub:

import sys
import os

sys.path.append("/export5/env/projects/stinger_demo/src/py/")
os.environ['STINGER_LIB_PATH'] = "/export5/env/projects/stinger_demo/build/lib/"

import stinger.stinger_net as sn
import stinger.stinger_core as sc

# used to configure the algorithm before connecting.
# note the storage per vertex and data description string. these advertise to
# other algorithms what fields you intend to store per vertex and the types of 
# each field as "types fieldname fieldname ..." where types encodes the types of
# the following fields as f (float), d (double), i (int32_t), l (int64_t), and b (uint8_t)
# storing strings is not supported.
ap = sn.StingerAlgParams()
ap['name'] = 'test_alg'
ap['data_desc'] = 'ddlb average_deg avg_neighbors count is_alive'
ap['data_per_vertex'] = 25 # 2 * sizeof(double) + sizeof(int64) + 1
ap['host'] = 'localhost'
ap['port'] = 10103

alg = sn.StingerAlg(ap)

alg.begin_init()
# do any preprocessing on the static graph before you enter batch mode.
# the graph can be accessed through alg.stinger()
alg.end_init()

# enter the batch mode
while(alg.alg.enabled):
  alg.begin_pre()
  # in this space you have access to the graph before the updates are applied
  # through alg.stinger() and to the updates that are about to be applied through
  # alg.alg.insertions, alg.alg.num_insertions, alg.alg.deletions, and alg.alg.num_deletions
  # insertions and deletions behave like arrays / lists in some ways, but are not iterable, so use:

  for i in xrange(alg.alg.num_insertions):
    ins = alg.alg.insertions[i]

    # each insertion has the fields etype, etype_str, source, source_str, destination,
    # destination_str, weight, and time.  these are not guaranteed to be populated (e.g.
    # if a stream is generating only strings, the source_str and destination_str fields will have
    # values, but the source and destination fields will only be 0).

  for d in xrange(alg.alg.num_deletions):
    de = alg.alg.deletions[d]

    # deletions are similar to insertions, but the weight and time fields have no meaning
    # and are thus not populated

  alg.end_pre()

  # if you would like to do any side computations while the server applies the
  # updates to the graph, you may do so here

  alg.begin_post()

  # this is the same as the preprocessing, but now the updates have been applied to the 
  # graph.  also, any unmapped strings will have been mapped.

  alg.end_post()
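
The data_per_vertex value in the example was computed by hand. As a cross-check, it can be derived from the data_desc string using the type codes listed in the comments above. This is a sketch rather than anything provided by stinger_net: bytes_per_vertex is a hypothetical helper, and it assumes the fields are stored back to back with no padding, which is what the arithmetic in the example implies.

```python
import ctypes

# Sizes for the type codes described above:
# f (float), d (double), i (int32_t), l (int64_t), b (uint8_t)
TYPE_SIZES = {
    'f': ctypes.sizeof(ctypes.c_float),
    'd': ctypes.sizeof(ctypes.c_double),
    'i': ctypes.sizeof(ctypes.c_int32),
    'l': ctypes.sizeof(ctypes.c_int64),
    'b': ctypes.sizeof(ctypes.c_uint8),
}


def bytes_per_vertex(data_desc):
    """Sum the field sizes named in a 'types fieldname fieldname ...' string.

    Assumes no padding between fields (hypothetical helper, not STINGER API).
    """
    parts = data_desc.split()
    types, fields = parts[0], parts[1:]
    if len(types) != len(fields):
        raise ValueError('expected one type code per field name')
    return sum(TYPE_SIZES[t] for t in types)


size = bytes_per_vertex('ddlb average_deg avg_neighbors count is_alive')
print(size)   # 25
```

The result matches the 25 bytes used for ap['data_per_vertex'] above: two doubles, one int64, and one byte.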

Apologies for the thin documentation, but these are rapid prototypes / internal tools that seemed like they might be of use to the broader community. If you are interested or have issues / questions, don't be a stranger: leave a comment below, open an issue on GitHub, or send an email.
