~ Deliverer: Distributed Framework for Bots ~
by Mhyst




Originally published @ searchlores in November 2004     Version 1.00, Updated in November 2004




Mhyst offers here, to every reader, a quite powerful 'queuer' bot: a sort of 'sandwich' for the small Java snippet bots that you may want to write yourself, and that Deliverer will schedule and send around. This is not an essay for newbies.
Source code is provided. I am confident that whoever takes the time to study (and implement) this will surely learn quite a lot about (Java) bots. I hope that your feedback will flow, and that this first step will spawn a lot of new small searching bots.


Deliverer: Distributed Framework for Bots

A first draft

Introduction

This document is more a Request For Comments than an essay. Suggestions and criticism are welcome.

This essay is for everybody, but to create bots following the paradigm explained here you need some knowledge of Java.

Broadly speaking, a bot is a program, written in any programming language, with enough "intelligence" to perform some "human" tasks: for instance, visiting a web page and registering a new account. Another example would be querying several search engines and mixing their outputs, possibly sorting them, removing duplicates and showing the results.

Generally all the code runs on the same machine. Now imagine that some or all parts of the code could be executed on different machines across a network (LAN, Internet, or a mix). Deliverer, a framework for distributed bots, makes that possible.

How it works

Deliverer is a Java application that we have to "install" on every machine where we want to run our code. Our bot's modules must be written by extending a Java class called Translator, as follows:


public class BotModule01 extends julk.net.deliver.Translator
{
    /*
     * Parameters
     * ----------
     * user:    Contact address (email, IP, etc.) of the
     *          person who sent the request.
     * service: Our module name.
     * command: Exact command requested (with service
     *          and parameters).
     * owr:     Results from the previous module. We can
     *          retrieve them by opening the file whose
     *          name is returned by owr.getName().
     */
    public boolean translate(String user, String service,
                             String command, WorkResult owr)
    {
        try {
            /* Here we'll write our code. It may or may not
             * process the file given in the owr parameter.
             * We'll put the results of our code in a file
             * called "botModule01Result.dat" or something
             * like that.
             */

            /* Now that our work is finished, we have to
             * declare our results, so that they can be
             * passed to the next module as its owr
             * parameter.
             */
            WorkResult wr =
                new WorkResult("botModule01Result.dat", false);
            setWorkResult(wr);
            return true;  // All went fine
        } catch (Exception e) {
            return false; // Some error occurred
        }
    }
}
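To make the skeleton more concrete, here is a minimal sketch of a module that removes duplicate lines from the previous module's result. So that it compiles without deliverer.zip, it ships its own tiny stand-ins for Translator and WorkResult: those two classes, the getWorkResult() accessor, the DedupModule name and its output file name are all assumptions made for illustration, not the real julk.net.deliver classes. The meaning of WorkResult's boolean argument is not documented here, so the sketch simply passes false as the skeleton does.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class DedupModuleDemo {
    /** Feeds the given lines to the module through a temporary file. */
    static List<String> runOnce(List<String> inputLines) {
        try {
            Path dir = Files.createTempDirectory("deliverer-sketch");
            Path input = dir.resolve("previous.dat");
            Files.write(input, inputLines, StandardCharsets.UTF_8);

            DedupModule module =
                new DedupModule(dir.resolve("dedupResult.dat").toString());
            boolean ok = module.translate("user@example.org", "dedup", "dedup",
                                          new WorkResult(input.toString(), false));
            if (!ok) {
                throw new IllegalStateException("module reported a failure");
            }
            return Files.readAllLines(
                Paths.get(module.getWorkResult().getName()), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(runOnce(Arrays.asList("a", "b", "a", "c", "b")));
    }
}

/** Hypothetical stand-in for julk.net.deliver.Translator. */
abstract class Translator {
    private WorkResult result;

    protected void setWorkResult(WorkResult wr) { result = wr; }

    WorkResult getWorkResult() { return result; } // accessor assumed for the demo

    public abstract boolean translate(String user, String service,
                                      String command, WorkResult owr);
}

/** Hypothetical stand-in for the real WorkResult class. */
class WorkResult {
    private final String name;

    WorkResult(String name, boolean flag) { this.name = name; } // flag is ignored here

    String getName() { return name; }
}

/** The module itself: keeps the first occurrence of every line. */
class DedupModule extends Translator {
    private final String outputName;

    DedupModule() { this("dedupResult.dat"); }

    DedupModule(String outputName) { this.outputName = outputName; }

    @Override
    public boolean translate(String user, String service,
                             String command, WorkResult owr) {
        try {
            // LinkedHashSet drops duplicates but keeps the original order.
            Set<String> unique = new LinkedHashSet<>(
                Files.readAllLines(Paths.get(owr.getName()), StandardCharsets.UTF_8));
            Files.write(Paths.get(outputName), unique, StandardCharsets.UTF_8);
            setWorkResult(new WorkResult(outputName, false));
            return true;  // All went fine
        } catch (Exception e) {
            return false; // Some error occurred
        }
    }
}
```

Against the real framework, only the DedupModule class would be kept, extending julk.net.deliver.Translator instead of the stand-in.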

We can create as many modules as we need or want with this procedure. The next step is to compile our new modules and copy them to the desired machines (among those running Deliverer). Remember to include the new classes in the Java CLASSPATH.

Once we have all the code ready, how does Deliverer manage it?

Deliverer was developed with parallel work in mind: balancing the workload across the network, connecting processes transparently, etc. There are probably many ways to achieve this, but I thought of a "queue" model, and that is how it works today. Each module is managed by a queue. We can have as many queues as modules, or more queues than modules (the better choice).

Deliverer can manage three kinds of queues: local queues connected to a module; local cyclic queues connected to a module (in case of failure, this kind of queue provides a number of retries); and remote queues, which are connected to a remote node (a network-reachable machine running Deliverer, with the same queue model but maybe different modules).
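The retry behaviour of a cyclic queue can be pictured with a small sketch. It assumes that a failed job goes back to the tail of the queue until its retries are used up; Deliverer's actual policy may differ, and the class name CyclicQueueSketch and the use of a Predicate as the "module" are inventions for this illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Predicate;

final class CyclicQueueSketch {
    /** A pending job together with the attempts it may still use. */
    private static final class Entry {
        final String job;
        int attemptsLeft;

        Entry(String job, int attemptsLeft) {
            this.job = job;
            this.attemptsLeft = attemptsLeft;
        }
    }

    private final Deque<Entry> pending = new ArrayDeque<>();
    private final int retries;
    private final Predicate<String> module; // returns true when the job succeeded

    CyclicQueueSketch(int retries, Predicate<String> module) {
        this.retries = retries;
        this.module = module;
    }

    void submit(String job) {
        // One first attempt plus the configured number of retries.
        pending.addLast(new Entry(job, 1 + retries));
    }

    /** Processes everything; returns the jobs that failed on every attempt. */
    List<String> drain() {
        List<String> failed = new ArrayList<>();
        while (!pending.isEmpty()) {
            Entry entry = pending.removeFirst();
            entry.attemptsLeft--;
            if (module.test(entry.job)) {
                continue;                 // done, drop it from the queue
            }
            if (entry.attemptsLeft > 0) {
                pending.addLast(entry);   // cycle: try again after the other jobs
            } else {
                failed.add(entry.job);
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // A flaky module that only succeeds on its third call.
        CyclicQueueSketch queue = new CyclicQueueSketch(5, job -> ++calls[0] >= 3);
        queue.submit("mailer#send");
        System.out.println("failed jobs: " + queue.drain() + ", calls: " + calls[0]);
    }
}
```

A plain local queue would correspond to the same sketch with zero retries.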

If we follow this outline, we can sense a recursive principle, and Deliverer takes advantage of it. Another thing to consider is that each node can be assigned a main Deliverer (as a back path). This main Deliverer, along with the remote queues, provides wide routing in order to locate nodes able to process a certain kind of information.

We can designate one or several nodes of our Deliverer network (DN from here on) to serve as entry/exit points for submitting jobs, while hiding the rest of the DN and relegating it to internal processing (which is a good idea). In the end, a DN will look like a tree.

A little example

      
              Our Deliverer Network for this example
      
      +--------------------+
      |  Outer Node        |
      | ------------------ |
      | Remote Queue >===============+
      | Local Mailer Queue |         |    +-------------------+
      | ------------------ |         |    |  Inner Node       |
      | Mail Reading Bot   |         +===>| ----------------- |
      +--------------------+              | Local Filer Queue |
                                          +-------------------+

We'll now see how it all works through a little example, so we need a simple bot... OK, let's assume we want to submit jobs to our DN via email, and all we need is to get a file from the Inner Node. This file has to be located and emailed to the user. To do so we don't need to code anything, because I have already done the work and we have these classes: julk.net.deliver.FileTranslator, julk.net.deliver.SMTPTranslator and julk.net.mail.bots.SPBroken. FileTranslator can retrieve any file from the local hard disk; SMTPTranslator can send any result to the user's email address; SPBroken is a runnable bot that reads its own mailbox every few minutes and submits each line as a separate job.

The syntax for submitting a job is pretty easy. Every statement consists of a group of pairs "service_name#command", and the pairs themselves are also separated by the character #. In each pair, the element "service_name" refers to a module, while "command" is passed to that module and helps to select different functions within it, or even to pass parameters. In the case of FileTranslator (mapped as "filer"), there are two functions: list and get. The "list" function retrieves a directory listing of a given path. The "get" function retrieves the specified file. When a job is submitted, Deliverer executes the pairs one by one, passing the result of the last processed pair to the next, until there are no more pairs. Generally the last pair is the one responsible for sending the answer to the user who submitted the job. The program could be this:


filer#get c:\autoexec.bat#mailer#send
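As an illustration of this syntax, the following minimal sketch splits a statement on # and groups the fields two by two. It is not Deliverer's own parser: the names JobSyntaxSketch and JobStep are invented here, and the sketch assumes that a command never contains a # character itself.

```java
import java.util.ArrayList;
import java.util.List;

final class JobSyntaxSketch {
    /** Splits a job statement on '#' and groups the fields two by two. */
    static List<JobStep> parse(String statement) {
        String[] fields = statement.split("#", -1);
        if (fields.length % 2 != 0) {
            throw new IllegalArgumentException(
                "expected service#command pairs, got: " + statement);
        }
        List<JobStep> steps = new ArrayList<>();
        for (int i = 0; i < fields.length; i += 2) {
            steps.add(new JobStep(fields[i].trim(), fields[i + 1].trim()));
        }
        return steps;
    }

    public static void main(String[] args) {
        for (JobStep step : parse("filer#get c:\\autoexec.bat#mailer#send")) {
            System.out.println(step);
        }
    }
}

/** One "service_name#command" pair of a job statement. */
final class JobStep {
    final String service;
    final String command;

    JobStep(String service, String command) {
        this.service = service;
        this.command = command;
    }

    @Override
    public String toString() {
        return service + " -> " + command;
    }
}
```

For the statement above this yields two steps, "filer" with the command "get c:\autoexec.bat" and "mailer" with the command "send", to be executed in that order.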

The user doesn't actually have to write #mailer#send, because the SPBroken bot appends it, assuming that a user submitting jobs by email will want to receive the results by email. Apart from this, Deliverer works according to its configuration. For now, let's assume everything is well configured with two nodes. The entry/exit node (main Deliverer) has only two queues: the first is a remote queue pointing to the second node, and the second is attached to the SMTPTranslator class (mapped as mailer). The inner node has only one queue, attached to the FileTranslator class (mapped as filer). When such a job is submitted to the main Deliverer, it asks (recursively): which queue is accepting filer jobs? The filer queue on the second node answers "yes", and the answer propagates upwards (to the main Deliverer). Then "filer#get c:\autoexec.bat" is submitted to the correct queue on the inner Deliverer. When that queue finishes its work, the job moves on to the next part, "mailer#send": which queue is accepting mailer jobs? No queue on the inner node answers, because there is only one queue and it isn't mapped to mailer. The question is then relayed to the main Deliverer, which accepts the job. The inner Deliverer transparently sends the file to the main Deliverer, and the job is submitted to the second queue, mapped to mailer. Then all is done.
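The lookup just described can be modelled in a few lines. This is only a sketch of the idea, assuming the network is a tree (no cycles among remote queues); the class DeliverNode and its methods are hypothetical and say nothing about how Deliverer implements the question on the wire.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class RoutingSketch {
    public static void main(String[] args) {
        DeliverNode outer = new DeliverNode("outer");
        DeliverNode inner = new DeliverNode("inner");
        outer.services.add("mailer");   // local queue mapped to mailer
        inner.services.add("filer");    // local queue mapped to filer
        outer.connectRemote(inner);     // remote queue outer -> inner

        System.out.println(outer.route("filer").name);  // prints "inner"
        System.out.println(inner.route("mailer").name); // prints "outer"
    }
}

final class DeliverNode {
    final String name;
    final Set<String> services = new HashSet<>();        // services of local queues
    final List<DeliverNode> remotes = new ArrayList<>(); // targets of remote queues
    DeliverNode mainDeliverer;                           // back path, may be null

    DeliverNode(String name) {
        this.name = name;
    }

    void connectRemote(DeliverNode child) {
        remotes.add(child);
        child.mainDeliverer = this;
    }

    /** Asks this node and, recursively, its remote queues (except 'skip'). */
    DeliverNode findDownward(String service, DeliverNode skip) {
        if (services.contains(service)) {
            return this;
        }
        for (DeliverNode remote : remotes) {
            if (remote == skip) {
                continue; // do not ask the node the question came from
            }
            DeliverNode hit = remote.findDownward(service, null);
            if (hit != null) {
                return hit;
            }
        }
        return null;
    }

    /** Looks downward first, then relays the question to the main Deliverer. */
    DeliverNode route(String service) {
        DeliverNode cameFrom = null;
        for (DeliverNode node = this; node != null; node = node.mainDeliverer) {
            DeliverNode hit = node.findDownward(service, cameFrom);
            if (hit != null) {
                return hit;
            }
            cameFrom = node;
        }
        return null; // nobody accepts this service
    }
}
```

With the two nodes of the example, a filer job entering at the outer node is routed down to the inner node, and the following mailer step, asked from the inner node, is relayed back up to the outer one.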

Configuration Files

There are no administration tools for configuring our Deliverer network, so we have to configure the nodes one by one. There are two ways to do it: configuration files, and telnet to port 4000. Here we'll see the fastest way, configuration files. These files must be in the same directory as deliverer.zip.

First of all we have to map our module classes to service names that we'll later link to queues. The file name for this is "mappedservices.cfg". It contains only lines of the form "service name/class name", as follows:


FILER                         julk.net.deliver.FileTranslator
MAILER                        julk.net.deliver.SMTPTranslator
NET                           julk.net.deliver.TCPNetTranslator
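To show how such a file can be read, here is a minimal sketch that loads the lines into a map from service name to class name. It assumes whitespace-separated columns and a case-insensitive service name (the samples in this essay write FILER in one place and filer in another); the class MappedServicesSketch is hypothetical and is not part of Deliverer.

```java
import java.util.Map;
import java.util.TreeMap;

final class MappedServicesSketch {
    /** Parses "SERVICE  fully.qualified.ClassName" lines into a map. */
    static Map<String, String> parse(String text) {
        // Case-insensitive keys, so that "FILER" and "filer" name the same service.
        Map<String, String> services = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        for (String line : text.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            String[] fields = trimmed.split("\\s+");
            if (fields.length != 2) {
                throw new IllegalArgumentException("bad mapping line: " + line);
            }
            services.put(fields[0], fields[1]);
        }
        return services;
    }

    public static void main(String[] args) {
        String cfg = String.join("\n",
            "FILER                         julk.net.deliver.FileTranslator",
            "MAILER                        julk.net.deliver.SMTPTranslator",
            "NET                           julk.net.deliver.TCPNetTranslator");
        Map<String, String> services = parse(cfg);
        System.out.println("filer is handled by " + services.get("filer"));
    }
}
```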

Second, queue configuration. The file name is "queues.cfg". This file is a bit more complex. Each line contains "queue name/service name/queue type (C/l/R) [/retries]", where the type letters presumably stand for cyclic, local and remote:


lmail                         mailer                        C    5
lnet                          net                           l    
lfiler                        filer_portatil                l    
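The optional fourth column is the only irregularity of this format. A minimal sketch of a reader, assuming whitespace-separated columns and treating a missing retries column as zero, could look like this; QueuesCfgSketch and QueueDef are names invented for the illustration, and the type letter is kept exactly as written in the file.

```java
import java.util.ArrayList;
import java.util.List;

final class QueuesCfgSketch {
    /** Parses "queue service type [retries]" lines. */
    static List<QueueDef> parse(String text) {
        List<QueueDef> defs = new ArrayList<>();
        for (String line : text.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            String[] fields = trimmed.split("\\s+");
            if (fields.length < 3 || fields.length > 4) {
                throw new IllegalArgumentException("bad queue line: " + line);
            }
            // Retries are optional; 0 is this sketch's default, not Deliverer's.
            int retries = fields.length == 4 ? Integer.parseInt(fields[3]) : 0;
            defs.add(new QueueDef(fields[0], fields[1], fields[2], retries));
        }
        return defs;
    }

    public static void main(String[] args) {
        String cfg = String.join("\n",
            "lmail                         mailer                        C    5",
            "lnet                          net                           l    ",
            "lfiler                        filer_portatil                l    ");
        for (QueueDef def : parse(cfg)) {
            System.out.println(def);
        }
    }
}

/** One line of queues.cfg. */
final class QueueDef {
    final String name;
    final String service;
    final String type;   // "C", "l" or "R", kept as written
    final int retries;

    QueueDef(String name, String service, String type, int retries) {
        this.name = name;
        this.service = service;
        this.type = type;
        this.retries = retries;
    }

    @Override
    public String toString() {
        return name + " [" + type + "] -> " + service + " (retries: " + retries + ")";
    }
}
```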

Third, schedule configuration. Deliverer has a little scheduler that lets you run some processes at regular intervals. Deliverer itself uses this scheduler to execute a queue tester every 15 minutes, so that if a node goes down it can be mounted again automatically later. Here we'll see how to set up SPBroken. The scheduler has its own configuration file, "schprog.cfg". Each line contains "class/start time/interval":


julk.net.mail.bots.SPBroken	now	600

Fourth, SPBroken configuration. The SPBroken bot needs an email account, and its details are stored in the "mail.cfg" file. Consider creating a new email account for this, because SPBroken deletes every mail after reading it.


POP3server=your_pop3_server
POP3port=110
POP3user=your_user_name
POP3pass=your_pass
SMTPserver=your_smtp_server
SMTPport=25
EMAIL=your_email_address
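Since these are plain key=value lines, a file like this can be read with the standard java.util.Properties class. The sketch below is only an illustration of that idea and does not claim that SPBroken reads its file this way; the class MailCfgSketch and the example host name are made up. Note that Properties treats a backslash as an escape character, which matters if a password contains one.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class MailCfgSketch {
    /** Loads key=value lines into a Properties object. */
    static Properties parse(String text) {
        Properties config = new Properties();
        try {
            config.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return config;
    }

    /** Reads a numeric setting such as POP3port or SMTPport. */
    static int port(Properties config, String key) {
        String value = config.getProperty(key);
        if (value == null) {
            throw new IllegalArgumentException("missing setting: " + key);
        }
        return Integer.parseInt(value.trim());
    }

    public static void main(String[] args) {
        String cfg = String.join("\n",
            "POP3server=pop.example.org",
            "POP3port=110",
            "SMTPserver=smtp.example.org",
            "SMTPport=25");
        Properties config = parse(cfg);
        System.out.println(config.getProperty("POP3server") + ":" + port(config, "POP3port"));
    }
}
```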

Deliverer classes and source code

The Deliverer classes are in a zip file called "deliverer.zip". Download it and copy it to c:\deliverer\ (or another directory), then copy your configuration files there.

You can run Deliverer as follows:


java -cp c:\deliverer\deliverer.zip julk.net.deliver.Deliverer

You can also download and analyze the source code. Feel free to modify or even rewrite it (in Java or the language of your choice). In that case, I would be interested in your work. You can contact me to clear up doubts or anything else related to my code or to how Deliverer works.

Security considerations

Deliverer uses a communication protocol. You can easily study it by running Deliverer on your machine and telnetting to localhost:4000. There is a "help" command that will be useful. Also, if you type "help(any command)" (without quotes), you'll get more information on that command's syntax. The help is written in Spanish... but there is little text. I plan to translate it all into English, but I'll need some time. You can probably guess many things just by reading the syntax, and there are Spanish/English dictionaries. In any case, I'm here and you can send me an email. I'll gladly answer.

Some protocol commands are security-sensitive, so you'll need a password to try them. The default password is "frodo". You can raise your privileges to admin by sending the following command:


setadmin(frodo)

Even with this admin/user privilege level, there are still many security problems. Anybody who knows about the service could get your internal files. My main aim now is to solve such issues by establishing a more robust security system. But security here depends more on the modules you write. Obviously, we have to remove any "filer" service if we want to prevent someone from getting our files.

One advantage is that Deliverer is not a well-known application. If we avoid using "help" as the name of a protocol command, it would be difficult to get a clue about what that mysterious port 4000 is offering.

Thanks to fravia+ for publishing it, and thank you for reading it.




Published @ searchlores in November 2004



(c) III Millennium: [fravia+] , all rights reserved and reversed