Thursday, March 28, 2019

Distributing Data Consistently (4/N)

In the previous post we figured out how to get Python to talk to etcd. Now let's do something with it.

One of the strengths of the previously-mentioned company where I used to work is that they invested a lot in shared infrastructure. This included developing a bunch of standard APIs that just worked without having to think a whole lot about them. That's one of the goals for this exercise. So let's start by seeing if we can put together a pain-free service locator.

Here's a simple, dumb implementation:

import etcd

class ServiceFinder:
  def __init__(self):
    self.etcd_client = etcd.Client(
      host=(('turtle1', 2379), ('turtle2', 2379), ('turtle3', 2379)),
      allow_reconnect=True
    )

  def get_service(self, service):
    return self.etcd_client.get('/services/' + service).value
and some mucking about in REPL:
>>> import service_finder
>>> sf = service_finder.ServiceFinder()
>>> sf.get_service('cdd')
'[ "turtle1", "turtle2", "turtle3" ]'

Look at how simple that is... import the module and ask it to tell you about the 'cdd' service.

"But wait, GG... you cheated. You hard-coded the etcd cluster addresses."

Yes, I did, but I wouldn't call it "cheating". Someone has to be the bottom turtle (see what I did there?). If you're building a system for service discovery then hardcoding the location of that system is perfectly justifiable. And in the real world you might do the hardcoding slightly more elegantly, such as through DNS SRV records. You can minimize the number of places where you have to hardcode things, but you can't eliminate them entirely.

Can we make it a little fancier, maybe have it tell us when the key changes?

Yes, yes we can:

import etcd
import threading

class ServiceWatcher(threading.Thread):
  def __init__(self, service, callback):
    threading.Thread.__init__(self)
    self.service = service
    self.etcd_client = etcd.Client(
      host=(('turtle1', 2379), ('turtle2', 2379), ('turtle3', 2379)),
      allow_reconnect=True
    )
    self.callback = callback

  def run(self):
    for event in self.etcd_client.eternal_watch('/services/' + self.service):
      self.callback(event.value)

class ServiceFinder:
  def __init__(self):
    self.etcd_client = etcd.Client(
      host=(('turtle1', 2379), ('turtle2', 2379), ('turtle3', 2379)),
      allow_reconnect=True
    )

  def get_service(self, service):
    return self.etcd_client.get('/services/' + service).value

  def watch_service(self, service, callback):
    watcher_thread = ServiceWatcher(service, callback)
    watcher_thread.start()
The above file (service_finder.py) defines two clases, ServiceFinder and ServiceWatcher. ServiceFinder provides the interface that programs can call, and ServiceWatcher is a sub-class of threading.Thread ('cause that's how threads work in Python 3.x) which will run in the background and watch for changes to a designated service. Here's a trivial code example:
#!/usr/bin/python3.6

import service_finder
import time
import threading

global thread_lock

def service_change_callback(new_data):
  thread_lock.acquire()
  print('Key change: ' + new_data)
  thread_lock.release()

thread_lock = threading.Lock()
sf = service_finder.ServiceFinder()
sf.watch_service('cdd', service_change_callback)

while True:
  time.sleep(1)
  thread_lock.acquire()
  print("Critical Section!")
  thread_lock.release()
Again, trying to keep things as simple as possible. Note that ServiceWatcher doesn't show up at all; the program doesn't need to know the gory details of how the threading works under the hood. All anyone needing to use the service_finder module needs to know is:
  • watch_service takes a callback which will magically get called whenever the specified service changes.
  • A modicum of locking is needed to prevent race conditions in the off chance that the callback is changing things that are used elsewhere (which is sort of the point of the callback in the first place).

Can it be made simpler? Can we hide the thread locking code somehow?

I don't think so. It would be simple enough to automagically wrap the callback in an acquire()/release() pair, but what about all the code that doesn't get touched by service_finder? No, the scope that invokes the service_finder module is the appropriate location to embody information about what the critical sections of code are. If I were writing production code the above would probably be turned into a module which could be invoked without worrying about the synchronization bits.

Anyway, there you go, that's the basic pattern. Once you've got that in place it becomes straightforward to perform all sorts of distributed operations in a safe and convenient matter.

0 Comments:

Post a Comment

<< Home

Blog Information Profile for gg00