diff --git a/xml/docbook/stdf_manual.xml b/xml/docbook/stdf_manual.xml
new file mode 100644
index 0000000..6722412
--- /dev/null
+++ b/xml/docbook/stdf_manual.xml
@@ -0,0 +1,978 @@
+
+
+
+
+
+
+
+ Segmented Telemetry Data Filter
+
+ Administrator's manual
+
+
+ EduardTibet
+
+
+ 28.03.2022
+
+
+
+ Introduction
+
+
+ Scope of this document
+
+ This is the complete administrator's manual of the
+ Segmented Telemetry Data Filter (STDF) software. It briefly describes
+ what STDF is intended for, its overall design, and the purpose of each
+ component. This manual also includes full information about the
+ installation process and usage of STDF. The theory and principles of
+ data filtering, as well as the syntax of the Erlang language (used for
+ data filtering), are completely out of the scope of this manual.
+
+
+
+ Document structure
+
+ This document includes the following parts:
+
+
+
+ - current
+ section.
+
+
+
+ - a
+ description of the software's overall design, features and
+ functionality.
+
+
+
+ -
+ the information about system requirements and installation of the
+ software.
+
+
+
+ - a section
+ describing how to create and master the filtering rules that are
+ deployed into one of the software components.
+
+
+
+ - a section
+ about customizing and fine-tuning the final data.
+
+
+
+ - a list of
+ possible issues and ways to resolve them.
+
+
+
+
+
+
+ Description of the STDF
+
+
+ Brief description of the STDF
+
+ STDF is data handling software designed to help capture
+ high-speed telemetry data. The purpose of STDF is to automatically
+ and linearly scale processing capacity for such data. STDF segments
+ data into smaller chunks and sends them through a load balancer to
+ several servers that filter the received data. That way it is possible
+ to:
+
+
+
+ avoid using a single high-powered processing unit working with
+ data;
+
+
+
+ reduce the required power of any single unit used for processing;
+
+
+
+ deploy the system with great flexibility and scalability,
+ based on various initial requirements and/or conditions.
+
+
+
+
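+ The chunking-and-spreading idea above can be sketched as follows. This is
+ an illustration only: the real segmentation logic is internal to the STDF
+ loadbalancer, and the chunk size, node names, and round-robin policy below
+ are assumptions made for the example.

```python
def segment(stream, chunk_size):
    """Split a raw telemetry stream into fixed-size chunks."""
    return [stream[i:i + chunk_size] for i in range(0, len(stream), chunk_size)]

def distribute(chunks, nodes):
    """Spread the chunks over the available filter nodes (round-robin)."""
    assignment = {node: [] for node in nodes}
    for i, chunk in enumerate(chunks):
        assignment[nodes[i % len(nodes)]].append(chunk)
    return assignment

chunks = segment(b"0123456789abcdef", 4)            # four 4-byte chunks
plan = distribute(chunks, ["filter-1", "filter-2"]) # two hypothetical nodes
```

+ With two filter nodes each node ends up processing half of the chunks,
+ which is the linear scaling of processing capacity described above.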
+
+ Overall design of STDF
+
+ The system consists of several parts:
+
+
+
+ coordinator component (node) - used for smart management of
+ the whole system;
+
+
+
+ loadbalancer component (node) - used for receiving raw data
+ from external sources (e.g. sensors) and transferring it further based
+ on the coordinator's directives;
+
+
+
+ filter component(s)/node(s) - used to process data
+ received from the loadbalancer. Processing is based on the current
+ workload: if it exceeds the maximum defined by the coordinator, data
+ chunks automatically migrate to other filter nodes whose free
+ resources are sufficient to handle the data. The number of filter
+ components within an installation varies and is based on current
+ performance needs.
+
+
+
+ At the heart of the STDF is a proprietary protocol
+ developed by Teliota. This protocol is used between
+ components to coordinate data manipulation, calculations on the
+ individual filters running on each server, and data migration between
+ filters.
+
+ The typical workflow includes the following steps:
+
+
+
+ the loadbalancer component receives all raw data from external
+ sources (e.g. sensors) and transmits it further to the filters based
+ on the coordinator's current workload rules and internal logic;
+
+
+
+ a filter component receives an independent dataset from the
+ loadbalancer and asks the cluster's coordinator to supply the
+ filtering rules;
+
+
+
+ the coordinator provides the rules to the filter; the rules are
+ then applied on the fly to the incoming data received from the
+ loadbalancer.
+
+
+
+ Each filtering component can talk to the coordinator component about
+ the data it is processing or wishes to process. The coordinator
+ component tells the loadbalancer component which data it
+ should provide to which filter node.
+
+
+ Overall design of STDF
+
+
+
+
+
+
+
+
+ If a filter component gets overloaded by the data, its tasks can
+ be offloaded to another filter node. Due to the nature of the workflow,
+ the algorithm assumes that:
+
+
+
+ a sufficient number of such redundant servers (filter nodes)
+ exists in the pool during an overload situation;
+
+
+
+ the offloaded data is similar to the original data and can be
+ filtered with the same rules.
+
+
+
+ An offloaded filter node is, therefore, not "independent". It has
+ to process the same data and instructions as its peer until the moment
+ the overload situation is resolved.
+
+ New processing (filter) nodes can be added into the processing
+ cluster on the fly by:
+
+
+
+ adding new server hardware;
+
+
+
+ installing the filter component software onto it;
+
+
+
+ configuring the coordinator server address.
+
+
+
+ The filter node will register itself with the coordinator, and the
+ coordinator will instruct the loadbalancer to forward traffic to this
+ new node.
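+ For the last step, a sketch of the relevant config.ini lines on the new
+ filter node (the option names and example values are the same ones used
+ in the installation section below):

```
COORDINATOR_SERVER_IP=192.168.2.53
COORDINATOR_SERVER_PORT=8860
```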
+
+ Telemetry data and filter operations are defined with a definition
+ file that in turn is written in a proprietary filter rule language. The
+ language defines in detail:
+
+
+
+ what the incoming data stands for;
+
+
+
+ how the data may be aggregated and filtered out when
+ outliers or unwanted values are found.
+
+
+
+ The coordinator reads the filter language files and runs them on
+ its own logic processing engine. This engine is connected to all the
+ filtering nodes, which receive processing instructions in the form of a
+ proprietary, compressed command protocol. The protocol is
+ bidirectional:
+
+
+
+ filter nodes and the loadbalancer inform the coordinator about
+ data they receive and their status.
+
+
+
+ the coordinator instructs:
+
+
+
+ the loadbalancer - where to send the initial raw
+ data;
+
+
+
+ the filters - what the data is and how it should be
+ manipulated.
+
+
+
+
+
+
+
+
+ Installation of the software
+
+
+ System requirements
+
+ To successfully install and run STDF, your base hardware/software
+ installation has to comply with the following requirements:
+
+
+
+ two (2) dedicated hardware servers for the coordinator and
+ loadbalancer components;
+
+
+
+ no other application software (e.g. MTA, DB, etc.), except for
+ the operating system and system utilities, should be installed on the
+ above servers;
+
+
+
+ the required number of servers that will be used as hosts for the
+ filtering components (nodes);
+
+
+
+ network connectivity with all sensors that gather information
+ for your application - your firewall rules should allow sensors to
+ access the STDF cluster (loadbalancer component);
+
+
+
+ network connectivity between all components of the STDF
+ installation, and with data receivers beyond the STDF deployment (a DB
+ or third-party application servers);
+
+
+
+ any recent Linux distribution with a kernel 2.6.32 or
+ later;
+
+
+
+ standard (base) Linux utilities, including:
+
+
+
+ tar - utility to work with
+ .tar files;
+
+
+
+ wget - utility to get packages from
+ the distribution server;
+
+
+
+ a console text editor to edit configuration files -
+ e.g. vim, nano,
+ etc.
+
+
+
+
+
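+ The kernel and utility requirements above can be checked with standard
+ commands (any output is system-specific):

```
$ uname -r
$ which tar wget
```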
+
+
+ User qualification
+
+ To install and maintain STDF, a system administrator has to
+ have:
+
+
+
+ skills equal to those required to successfully pass
+ the LPIC-2 exam;
+
+
+
+ some knowledge of the Erlang language syntax to write filtering
+ rules;
+
+
+
+ thorough knowledge of the "STDF filtering rules language
+ reference" manual (supplied by Teliota separately).
+
+
+
+
+
+ Installation process of components
+
+
+ Getting packages of components
+
+ All packages are to be downloaded from the Teliota distribution
+ web server: https://download.teliota.com
+ .
+
+
+
+ Installation of a coordinator component
+
+ To install a coordinator component:
+
+
+
+ Go to the top-level installation directory.
+
+
+
+ Make a directory for coordinator's files:
+
+ $ mkdir stdf_coordinator
+
+
+
+ Change to the newly created directory:
+
+ $ cd stdf_coordinator
+
+
+
+ Download the package with a coordinator component:
+
+ $ wget https://download.teliota.com/bin/stdf_coordinator.tar.bz2
+
+
+
+ Untar coordinator component files:
+
+ $ tar -xjf stdf_coordinator.tar.bz2
+
+
+
+ Open the configuration file config.ini in
+ any text editor and set up the IP address and port that the coordinator
+ component should listen on:
+
+ COORDINATOR_SERVER_LISTEN_IP=192.168.2.53
+COORDINATOR_SERVER_LISTEN_PORT=8860
+
+
+
+
+ Change directory to the bin/
+ folder:
+
+ $ cd bin/
+
+
+
+ Check that the file stdf_coordinator.sh
+ has the execution bit turned on (if not, set it with
+ chmod +x stdf_coordinator.sh).
+
+
+
+ Run the coordinator:
+
+ $ ./stdf_coordinator.sh
+
+
+
+ The coordinator needs to be fed with filtering rules. The
+ coordinator includes a separate language parsing and debugging tool
+ which validates a filter rule.
+ It is assumed that you have already written your filtering rules.
+ If you have not written any rules yet, first check the section .
+
+
+ To deploy a filtering rule:
+
+
+
+ Check the filtering rule:
+
+ $ ./stdf_parser.sh -i [rulefile1]
+
+
+
+ If there are any output messages, read them carefully.
+ These messages are also saved to a log file for future
+ analysis.
+
+
+
+ Copy the rule file to the filter_rules
+ directory within the coordinator installation:
+
+ $ cp [rulefile1] ../filter_rules
+
+
+
+ Open the configuration file config.ini in
+ any text editor and add the recently copied file to the
+ coordinator's list of rule files:
+
+ COORDINATOR_RULES_FILES=rulefile1,rulefile2
+
+
+
+ Restart the coordinator component:
+
+ $ ./stdf_coordinator.sh restart
+
+
+
+
+
+ Installation of a loadbalancer component
+
+ To install a loadbalancer component:
+
+
+
+ Change the current directory to the top-level installation
+ directory.
+
+
+
+ Make a directory for the loadbalancer component
+ files:
+
+ $ mkdir stdf_loadbalancer
+
+
+
+ Change to the newly created directory:
+
+ $ cd stdf_loadbalancer
+
+
+
+ Download the package with a loadbalancer component:
+
+ $ wget https://download.teliota.com/bin/stdf_loadbalancer.tar.bz2
+
+
+
+ Untar the loadbalancer component files:
+
+ $ tar -xjf stdf_loadbalancer.tar.bz2
+
+
+
+ Open the configuration file config.ini in
+ any text editor and point the loadbalancer to the coordinator's IP
+ address and port number:
+
+ COORDINATOR_SERVER_IP=192.168.2.53
+COORDINATOR_SERVER_PORT=8860
+
+
+
+
+ Change directory to the bin/
+ folder:
+
+ $ cd ./bin
+
+
+
+ Check that the file
+ stdf_loadbalancer.sh has the execution bit
+ turned on (if not, set it with chmod +x stdf_loadbalancer.sh).
+
+
+
+ Run the loadbalancer component:
+
+ $ ./stdf_loadbalancer.sh
+
+
+
+
+
+ Installation of a filtering component
+
+ To install a filtering component:
+
+
+
+ Change the current directory to the top-level installation
+ directory.
+
+
+
+ Make a directory for filtering component files:
+
+ $ mkdir stdf_node
+
+
+
+ Change to the newly created directory:
+
+ $ cd stdf_node
+
+
+
+ Download the package with a filtering component:
+
+ $ wget https://download.teliota.com/bin/stdf_node.tar.bz2
+
+
+
+ Untar the filtering component files:
+
+ $ tar -xjf stdf_node.tar.bz2
+
+
+
+ Open the configuration file config.ini in
+ any text editor and point the filtering component to the
+ coordinator's IP address and port number:
+
+ COORDINATOR_SERVER_IP=192.168.2.53
+COORDINATOR_SERVER_PORT=8860
+
+
+
+
+ Change directory to the bin/
+ folder:
+
+ $ cd ./bin
+
+
+
+ Check that the file stdf_node.sh has
+ the execution bit turned on (if not, set it with
+ chmod +x stdf_node.sh).
+
+
+
+ Run the filtering component:
+
+ $ ./stdf_node.sh
+
+
+
+ Repeat the above steps for all filter components that are to be
+ installed.
+
+
+
+ Start feeding data into the data interface of the
+ loadbalancer component.
+
+
+
+
+
+
+
+ Authoring filtering rules
+
+
+ This section only briefly describes the structure of filtering
+ rules. For detailed information, see the "STDF filtering rules
+ language reference" manual (supplied separately).
+
+
+ Filtering rules are defined using a filtering language that takes
+ the Erlang language syntax as a basis.
+
+ Each filtering rule includes three elements (so called
+ "definitions"):
+
+
+
+ data definition - describes the nature of the data to be filtered,
+ including the pattern by which the incoming data can be recognized (e.g.
+ port, input URL, data header); the data definition assigns an
+ identifier to the dataset so that the correlation and filter
+ rules can refer to it;
+
+
+
+ correlation definition - describes how that data depends on
+ itself or some other identified dataset;
+
+
+
+ filter definition - describes what actions are to be taken on
+ the data when it arrives.
+
+
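+ A sketch of how these three definitions might fit together is shown
+ below. The syntax, identifiers, and values here are invented for
+ illustration - the actual grammar is defined in the "STDF filtering
+ rules language reference" manual.

```erlang
%% Hypothetical rule file -- illustrative only, not real STDF syntax.
data(temp_stream) ->
    {port, 9001, header, <<"TLM1">>}.       %% how incoming data is recognized

correlation(temp_stream) ->
    {window, 60, depends_on, temp_stream}.  %% data correlated with itself

filter(temp_stream) ->
    {drop_outliers, {min, -40, max, 125}}.  %% action taken when data arrives
```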
+
+
+
+ Using and verifying filtered data
+
+ The filtering cluster appoints one of its nodes automatically as a
+ forwarder, based on the load of the servers. The forwarder collects the
+ data from each filtering node, combines it into one stream, and sends it
+ to whatever server is designated as the final receiver
+ (destination).
+
+
+ The filtering components (nodes) don't store any data - they
+ only perform filtering. You have to define and configure the storage
+ server beyond the STDF deployment that will perform any and all
+ database processing. A connection to the designated DB server is
+ configured within the coordinator component's configuration file
+ config.ini.
+
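+ The exact option names for the DB server connection are defined by the
+ config.ini template shipped with your STDF version; the fragment below is
+ purely hypothetical and only illustrates the idea:

```
DB_SERVER_IP=192.168.2.100
DB_SERVER_PORT=5432
```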
+
+ The forwarder can optionally inject additional data headers and
+ trailers into the initial data block for easier recognition of its nature
+ (source transmitter/generator). The trailer may contain a CRC for
+ checking data integrity. The algorithm for the CRC is shown below:
+
+ def crc16(buff, crc = 0, poly = 0xa001):
+     # Bit-by-bit reflected CRC-16 with polynomial 0xA001.
+     l = len(buff)
+     i = 0
+     while i < l:
+         ch = buff[i]
+         uc = 0
+         while uc < 8:
+             if (crc & 1) ^ (ch & 1):
+                 crc = (crc >> 1) ^ poly
+             else:
+                 crc >>= 1
+             ch >>= 1
+             uc += 1
+         i += 1
+     return crc
+
+ # The 16-bit result is split into two bytes before being written
+ # into the trailer:
+ crc = crc16(b"example telemetry block")
+ crc_byte_high = (crc >> 8)
+ crc_byte_low = (crc & 0xFF)
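+ As a sanity check, the routine above implements the common reflected
+ CRC-16 variant (polynomial 0xA001, zero initial value), whose well-known
+ check value for the ASCII string "123456789" is 0xBB3D. The snippet below
+ repeats the function in a compact form so that it runs standalone; the
+ sample payload is an assumption for the example.

```python
def crc16(buff, crc=0, poly=0xa001):
    # Compact form of the bit-by-bit CRC-16 routine above.
    for ch in buff:
        for _ in range(8):
            if (crc & 1) ^ (ch & 1):
                crc = (crc >> 1) ^ poly
            else:
                crc >>= 1
            ch >>= 1
    return crc

# Well-known CRC-16/ARC check value:
assert crc16(b"123456789") == 0xBB3D

# Splitting the result into trailer bytes, as in the listing above:
crc = crc16(b"example telemetry block")
trailer = bytes([crc >> 8, crc & 0xFF])  # high byte first
```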
+
+
+
+ Troubleshooting
+
+
+ Problem: no connection from a filter node to a
+ coordinator
+
+
+
+
+
+ Possible reasons
+
+ How to solve the problem
+
+
+
+
+
+ The coordinator's IP or port settings on a filter node
+ are not correct or were not set.
+
+ Check that the coordinator's IP address and port
+ number are set correctly on the filters.
+
+
+
+ Firewall rules don't allow filter packets to reach the
+ coordinator
+
+ Check that the coordinator's firewall settings (open ports and
+ IP rules) are correct.
+
+
+
+ The coordinator node is not running
+
+ Check if the coordinator is actually running.
+
+
+
+
+
+
+
+ Problem: filtering node doesn't receive filtering rules
+
+
+
+
+
+ Possible reason
+
+ How to solve the problem
+
+
+
+
+
+ The coordinator's IP or port settings on a filter node
+ are not correct or were not set.
+
+ Check the IP address and port number (see the first
+ solution of the previous problem).
+
+
+
+ Errors in the filtering language
+
+ Check the coordinator's log file for errors
+
+
+
+ Issues with network connectivity or with the software used
+
+ Check the coordinator's log file for errors; check the node's
+ firewall settings
+
+
+
+
+
+
+
+ Problem: filtering node doesn't receive data
+
+
+
+
+
+ Possible reason
+
+ How to solve the problem
+
+
+
+
+
+ The loadbalancer is not running
+
+ Check for errors in the loadbalancer's log files
+
+
+
+ Ports are closed or filtered by a firewall
+
+ Check the node's firewall settings
+
+
+
+ No actual data has been received
+
+ Check the loadbalancer's log file for transmitted data
+
+
+
+
+
+
+
+ Problem: loadbalancer doesn't receive any data
+
+
+
+
+
+ Possible reason
+
+ How to solve the problem
+
+
+
+
+
+ The loadbalancer is not running
+
+ Check if the loadbalancer is running and check for errors in
+ the loadbalancer's log files.
+
+
+
+ Ports are closed or filtered by a firewall
+
+ Check the loadbalancer's firewall settings
+
+
+
+
+
+
+
+ Problem: Filter produces incorrect results
+
+
+
+
+
+ Possible reason
+
+ How to solve the problem
+
+
+
+
+
+ Incorrect initial filter setup
+
+ Run the node with a higher level of verbosity: start it with
+ ./stdf_node.sh -vvv and then check the log
+ files for possible issues
+
+
+
+ Incorrect filter rules
+
+ Run the filter language parser to validate the rule's
+ syntax: run ./stdf_parser.sh --validate
+ [rulefile1]
+
+
+
+
+
+
+
+
+ Technology stack behind this sample document
+
+ The source files of this document:
+
+
+
+ were completely written in the DocBook/XML 5.1
+ format, which is an OASIS
+ Standard;
+
+
+
+ were WYSIWYM-authored using XMLmind XML
+ Editor version 7.3 by XMLmind Software, installed
+ on the author's desktop running Debian GNU/Linux 10.11
+ (buster). The author also used Dia Diagram Editor for
+ diagrams.
+
+
+
+ are freely available on GitHub as the docbook-samples
+ project;
+
+
+
+ are distributed under a Creative Commons license - for details see
+ .
+
+
+
+ To produce the .fo file of this document, the
+ following software was used:
+
+
+
+ The local copy of DocBook XSL
+ Stylesheets v. 1.79.1 was used.
+
+
+
+ Author's customization layer of the above stylesheets, which is
+ now the docbook
+ pretty playout project, freely available on GitHub.
+
+
+
+ xsltproc as an engine to produce the
+ .fo file from the DocBook source
+ .xml file (xsltproc compiled
+ against libxml 20904,
+ libxslt 10129 and libexslt
+ 817).
+
+
+
+ To get the resulting .pdf file from the
+ .fo file, the author used the Apache FOP 2.3
+ engine with the foponts
+ project, created and maintained by the author of this
+ document.
+
+
+
+ License
+
+ This work is licensed under a Creative
+ Commons Attribution-NonCommercial-ShareAlike 4.0 International
+ License.
+
+
diff --git a/xslt-out/result1.xml b/xml/shakespear/allswell.xml
similarity index 99%
rename from xslt-out/result1.xml
rename to xml/shakespear/allswell.xml
index 86863e5..bb7590d 100644
--- a/xslt-out/result1.xml
+++ b/xml/shakespear/allswell.xml
@@ -1,3 +1,4 @@
+
All's Well That Ends Well
diff --git a/xml/lines7000.xml b/xml/shakespear/lines7000.xml
old mode 100755
new mode 100644
similarity index 100%
rename from xml/lines7000.xml
rename to xml/shakespear/lines7000.xml
diff --git a/xml/play.dtd b/xml/shakespear/play.dtd
similarity index 100%
rename from xml/play.dtd
rename to xml/shakespear/play.dtd