Sunday, 29 September 2019

A Fledgling Hatches

It has been a busy couple of weeks for the FogLAMP project and those of us at Dianomic. First we released a new version of FogLAMP, 1.7, in preparation for the PI World EMEA conference. We then attended that conference and gave a training session for 30 people on FogLAMP, and this week we have been at the Open Network Symposium in Antwerp. The ONS saw the launch of Fledge, a Linux Foundation Edge project for industrial IoT solutions. This project is in fact a rebranding of FogLAMP and a move of FogLAMP to Linux Foundation governance.

The Linux Foundation LF Edge initiative already has a number of other projects whose ranks Fledge joins; Akraino, Baetyl, EdgeX Foundry and Project EVE, each of which is targeted at a specific area of the edge device landscape. Over time these projects will co-operate and cross-fertilise across the areas they target to bring the best solutions to the open source world. One of the original reasons we started the FogLAMP project was to attempt to tackle the diversity and lack of standardisation in one specific area of the edge and fog computing worlds; namely extracting data from any sensor, machine or IoT device and delivering that data to as many destinations as possible. Becoming a Linux Foundation project and collaborating with the LF Edge membership can only help to accelerate the standardisation process by providing free-to-use platforms for data acquisition.

This is a major development for us and the FogLAMP project. Over the coming weeks we will be moving the source code into the Linux Foundation and transitioning our development work to Fledge. Of course this will take a little time, and we will continue to support existing FogLAMP deployments as well as develop new features that will become part of Fledge.

Thursday, 26 September 2019

An Industrial Week

In a break from filling in the back story of FogLAMP I thought I would write about my experiences and the conversations I had last week at the EMEA PI World conference in Gothenburg. OSIsoft run two conferences a year, one in San Francisco and one in a European city; this year it was the turn of Sweden and the city of Gothenburg. Dianomic, as a company part owned by OSIsoft and aimed at producing software for use in this environment, usually attends these events. This year was no exception: we attended as an exhibitor and gave a training lab session on the last afternoon of the conference.

In my position as the architect of FogLAMP these events give me a good opportunity both to present that architecture to real industrial users and to get a better sense of how what we are building might fit into their environments. This tends to take place in the exhibition hall, in which Dianomic has a presence; however, before this can happen the conference always starts with keynote speeches. Since the exhibition hall is closed during the keynotes, these are the few talks that I manage to attend during the three days of the conference. The keynotes are given by OSIsoft themselves and by invited OSIsoft customers.

We, as the FogLAMP developers, are very lucky that for the past few conferences we have been mentioned in the keynotes as part of the OSIsoft pervasive data collection story, FogLAMP being the open source, Linux collection mechanism in this strategy. This year was no exception; however, it went slightly further, with a section about how OSIsoft had used their own PI Data Historian within their automated headquarters building in San Leandro, the SLTC. FogLAMP formed part of that SLTC installation, with Phidget sensors installed in the conference rooms to measure light levels, noise levels, temperature and humidity. This data is fed into the building's PI Server and used to verify the performance of the building automation system. Watching the software being demonstrated live on stage was a little nerve-racking, but it went smoothly and served to illustrate the flexibility and usefulness of FogLAMP in situations we never envisaged.

Once the keynotes were complete the conference swung into its second phase, at least for me: manning the exhibition booth, talking about FogLAMP and how it might fit in with the various visitors' projects or plans for the future. This is always interesting and throws up unexpected or new use cases for FogLAMP. I particularly like hearing about what potential users are looking for, as it helps form a future roadmap for FogLAMP, or at least steer the priorities. The other thing that I like is to listen to how others portray what we are doing; I find it interesting how others with a different viewpoint depict the project.

One thing that is clear to me is that trying to describe the entire project in one go is not the best approach; there are so many options and components that it makes no sense. The usual approach most people take in describing FogLAMP is a bit like peeling an onion. The starting point is often a very high level description: it is a piece of open source software for connecting any sensor or machine to any data storage and analysis platform. Although, since this conference is all about the PI Server, that latter part is often changed to "to any PI Server or other system". The next level of description would then talk about the plugins for the sensors and protocols in the south service. This would include some examples of the plugins we already have, such as Modbus, OPC UA, DNP3 etc., and also a discussion around what it takes to add a new hardware device or protocol. This was always aided by having examples of hardware, such as a FLIR infrared camera and Advantech data acquisition modules, on hand to illustrate the types of devices.

Buffering was the next level down into the architecture we would usually venture, explaining the reason for it and talking about use cases without always-on or reliable network connectivity. We would discuss in relation to this that we can use the buffered data for accessing recent history and providing edge applications with that history. The example we always quote here is that of mining trucks that spend the day underground gathering information; when they emerge they push all that accumulated data to the central data historian. I was given another example by one of our visitors: that of trains spending the day out on the network gathering data on the train and track, then transmitting that data for analysis, combined with the data from all the other trains, once they get back to the depot.

Our next level of discussion goes to the filtering capabilities of FogLAMP and the ability to modify the data as it traverses the FogLAMP services. This is an area that, when we first started on the project, we were told to avoid; the advice being to send all the data unadulterated and let the transformation take place centrally. It has transpired this was not good advice, for a number of reasons: some data is very high bandwidth and cannot be buffered and transmitted sensibly, so must be summarised using some algorithm at the edge. This could be because the data is truly high bandwidth, like vibration data, or because the communication links are very slow or expensive. Another reason to modify the data at the edge is to augment it and add value at the edge, allowing better edge analytics to be performed. This then brings us on to the next level of discussion: the ability of FogLAMP, again via plugins, to run event notification rules and deliver notifications, via another set of plugins, at the edge itself.
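To illustrate the kind of edge summarisation described above, a filter might reduce a high-bandwidth vibration stream to one RMS value per window before it is buffered or sent north. This is a minimal sketch only; the function name and shape are invented for illustration and are not the actual FogLAMP filter plugin API.

```python
import math

def rms_filter(samples, window=4):
    """Summarise a high-bandwidth stream at the edge: emit one RMS
    value per window of raw samples, so far less data needs to cross
    a slow or expensive link.

    samples: list of raw readings (floats)
    window:  number of raw samples per summary value
    """
    summaries = []
    for i in range(0, len(samples) - window + 1, window):
        chunk = samples[i:i + window]
        summaries.append(math.sqrt(sum(x * x for x in chunk) / window))
    return summaries
```

A window of four is used here only to keep the example small; real vibration data would use far larger windows.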

Having the ability to do edge notifications is important in cases where time is critical; the example invoked here is the self-driving car. You would not consider gathering all the data from the sensors on the car and sending that data to the cloud in order to determine if it should apply the brakes. Of course we are not looking at utilising FogLAMP in the control system of a self-driving car, but it may be used where it can alert to stop a process due to some measured values falling outside of an acceptable range, for example. This need for events close to the edge is even more important when networks are slow or intermittent.
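The out-of-range alerting just described can be sketched as a simple rule evaluated locally, with delivery handled by a callback standing in for a delivery plugin. The class below is a hypothetical illustration of the idea, not FogLAMP's notification rule interface; note it only alerts on the transition into the out-of-range state, so a persistently bad reading does not flood the delivery channel.

```python
class EdgeNotifier:
    """Evaluate readings against a range rule entirely at the edge and
    deliver an alert locally, with no round trip to a central server."""

    def __init__(self, low, high, deliver):
        self.low = low
        self.high = high
        self.deliver = deliver      # stands in for a delivery plugin
        self.triggered = False

    def process(self, value):
        out_of_range = value < self.low or value > self.high
        if out_of_range and not self.triggered:
            # Alert only on the transition into the bad state
            self.deliver(f"value {value} outside [{self.low}, {self.high}]")
        self.triggered = out_of_range
```

For example, feeding it the readings 50, 120, 130, 40 with a 0 to 100 range produces a single alert, at the first excursion.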

We also talk about how we are experimenting with adding machine learning to FogLAMP, either within a south service, using cameras as a way to create data, usually in the form of image classification, or using machine learning to determine normal operating parameters and alert on anomalies.

The final afternoon of the conference is spent running a practical lab session, something we have now done at a number of PI World conferences; with each one the content of the lab becomes more complex as the FogLAMP project matures. This year we had a full house of 30 students in the lab, each of whom was given a remote control car with a Raspberry Pi Zero and an Envirophat sensor board attached. A USB battery pack provides the power for the Raspberry Pi and the Envirophat. The Envirophat has sensors for acceleration, a gyroscope, temperature and light levels; we use these to measure the movement of the cars and when they pass under coloured lights.

The content of the lab takes the students through installing and configuring FogLAMP for basic data collection, attaching FogLAMP to a PI Data Historian, and then adding filters and edge notifications to the FogLAMP instance running on the car. We cap the afternoon off with a points-based game where everybody drives the cars around the floor, trying to score points for exuberant driving and for driving under lights that are constantly changing colour, gaining points based on the colour of the lights. Points are lost if the car is rolled. The points are all calculated in the central data historian and a live scoreboard is shown based on the data being collected from the sensors on the cars. This is always chaotic and great fun; this year it was made worse because we forgot to label the tops of the cars and people quickly lost track of which car was theirs.

Wednesday, 11 September 2019

FogLAMP, the back story - part 2

In the first post in this blog I outlined the idea and major design blocks of FogLAMP. I wanted to continue with some of this back story and fill in a few more of the details behind the architecture and philosophy of FogLAMP. In later posts I intend to talk more about the things that can be done with it, some of the interesting discussions we have with users and potential users, and scenarios for how to glue all the bits together. These initial posts are worthwhile, I hope, as they cast a light on some of the reasons why things are the way they are.

The use of micro services was a natural fit for FogLAMP, so much so that it hardly needed talking about. The reasons behind this choice are the traditional reasons for using micro services, plus one or two more specific ones:
  • Isolation - we wanted the different micro services to isolate the different machines and protocols used to monitor sensors, buffer data or send it onwards.
  • Reliability - with isolation you also get reliability. A failure in one protocol or sensor plugin cannot affect another. Also, if a micro service fails it can be restarted without impacting the other services within the system.
  • Scale out - as the system grows more micro services can be spun up to satisfy that growth. This may be either within a single machine or distributed across many machines.
  • Embedding - a rather different advantage than normally stated for a micro service architecture, but relevant to the Internet Of Things. By isolating functionality in small units it becomes easier to create embedded versions of those micro services within sensor devices themselves.
  • Ease of extension - micro services give a framework in which it is easy to extend the functionality of a system by adding more specialised micro services to perform tasks not initially part of the system. Because these services act independently within the system there is no need to fully understand each service when adding new services for specific tasks; these new services interact with the rest of the system using well defined interfaces and hence are unable to adversely affect the operation of the existing services.
  • Asynchronous operation - in a system in which radically different requirements exist for real time or near real time execution, it is important that the implementation of one time critical function cannot interfere with another. This is particularly the case for the south interface to hardware and machine monitoring protocols.
Major FogLAMP Services

The other key decision made early on was the use of plugins to provide extensibility, the goal being to make it as easy as possible for any user of FogLAMP to extend it to support a sensor, protocol or north system. It should be a matter of hours or days, rather than weeks or months, to add such support. Each type of plugin would have a minimal, well defined and easy to implement set of entry points; these should be limited to half a dozen at most. Additionally, in order to provide protection against misconfiguration of a plugin, this interface would allow the FogLAMP service that is loading a plugin to determine if the plugin is the right type for the operation it is being asked to perform. All plugins, regardless of where in the system they will be used, support a base set of lifecycle operations:
  • Information - allows FogLAMP to obtain information about the plugin. This is not just what type of plugin it is, but also what version of the plugin API it supports, what configuration the plugin needs, how the plugin expects to be run and the version of the plugin itself.
  • Initialisation - starts the plugin operating and provides the initial configuration of the plugin.
  • Shutdown - terminates the plugin prior to a service or system shutdown or restart.
As well as these generic interfaces each class of plugin would also have its own additional entry points that were specific to the type of plugin.
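The three lifecycle operations above might look roughly like the following in a Python plugin. This is a hedged sketch modelled on the general pattern; the exact field names and entry point signatures vary by plugin type and API version, and the plugin name and configuration item here are invented.

```python
# Sketch of the generic lifecycle entry points a plugin exposes.
# The 'type' and 'interface' fields let the loading service verify the
# plugin is of the right kind and speaks a supported API version.

_PLUGIN_INFO = {
    'name': 'example',        # hypothetical plugin name
    'version': '1.0',         # version of the plugin itself
    'type': 'south',          # what type of plugin it is
    'interface': '1.0',       # version of the plugin API supported
    'config': {               # what configuration the plugin needs
        'pollInterval': {
            'description': 'Interval between polls in milliseconds',
            'type': 'integer',
            'default': '1000',
        },
    },
}

def plugin_info():
    """Information: describe the plugin to the service loading it."""
    return _PLUGIN_INFO

def plugin_init(config):
    """Initialisation: start the plugin with its initial configuration
    and return a handle passed to subsequent calls."""
    return {'config': config}

def plugin_shutdown(handle):
    """Shutdown: terminate the plugin prior to service shutdown."""
    handle.clear()
```

The handle returned by `plugin_init` is how the service carries plugin state between calls without the plugin needing globals.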

One thing that concerned me from the beginning was how to make the configuration as extensible as the system itself. As new micro services or plugins are added, the configuration of FogLAMP should also be extended, but extended in such a way as to make it look like there is one single configuration engine. Configuration thus became an early component to be designed and implemented, and a number of requirements for it were drawn up:
  • New components, services and plugins, must be able to extend the configuration.
  • The configuration must be discoverable by external entities without those entities having prior knowledge of what configurable items might exist.
  • The system must be able to be operated in a 24/7 mode, therefore all reconfiguration must be dynamic. It should not be a requirement to restart the system or indeed a service within the system for new configuration values to take effect.
  • It should be possible to upgrade components of the system whilst the system is running. These upgraded components must be able to add to, or deprecate items from, the configuration of earlier versions of the upgraded component. Values entered by users or administrators must of course be preserved during these operations.
To this end a component of the core micro service was designed to manage the configuration. Configuration data would be stored as JSON objects within a hierarchy of configuration categories. These JSON objects would contain not just the configuration data itself, but also metadata about the configuration item in question. This metadata would allow a client application, such as a graphical user interface, to discover what configuration was available and what rules might exist for the items. This would include a description of the item, a type for the item, and constraints such as minimum value, maximum value, length etc. A default value for the item was also included. Later this metadata would be augmented with rules for validation of the item value and also with dependencies between items.
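A configuration category of this shape might look like the following. The items and field names here are an invented illustration of the idea, not a category copied from FogLAMP; the real metadata includes further fields.

```python
# Hypothetical configuration category: each item carries metadata
# (description, type, default, constraints) alongside its value, so a
# client such as a GUI can discover and validate items it has no prior
# knowledge of.
category = {
    "port": {
        "description": "TCP port the service listens on",
        "type": "integer",
        "default": "8081",
        "minimum": "1024",      # constraint usable for validation
        "maximum": "65535",
        "value": "8081",
    },
    "assetName": {
        "description": "Name under which readings are stored",
        "type": "string",
        "default": "sensor1",
        "value": "sensor1",
    },
}
```

Because the constraints travel with the item, a user interface can render a numeric field with bounds for `port` and a free text field for `assetName` without any prior knowledge of either.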

Any component, be that a micro service, plugin or logical component within a micro service, that wished to have configuration would create a configuration category for its configuration items. Upon startup the component would first obtain the current content of its configuration category. It would then merge its internal default category contents with that which it had retrieved from the configuration manager. This merging operation allowed for components to be updated with new data within the category and for that to be added to the existing configuration data. The merge process would preserve user entered values for configuration items that already existed in the category, whilst adding new items, taking the value of each new item from the default as defined in the component's internal configuration category. This allows the configuration to be updated without loss of user values. Once merged, the component would set the category back into the configuration manager to allow it to be persisted for future executions. This functionality allows plugins and components not just to extend the configuration of the system, but also to add and deprecate configuration as components are updated.
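The merge operation described above can be sketched in a few lines. This is an illustrative reconstruction of the semantics, under the assumption that items absent from the new defaults are deprecated and dropped; it is not FogLAMP's actual implementation.

```python
def merge_category(stored, defaults):
    """Merge a component's internal default category with the stored
    category retrieved from the configuration manager.

    - Items present in both keep the stored (possibly user-entered) value.
    - Items only in the defaults are added, valued from their default.
    - Items no longer in the defaults are deprecated and dropped.
    """
    merged = {}
    for name, item in defaults.items():
        if name in stored:
            # Preserve the user's value, but take fresh metadata
            merged[name] = dict(item, value=stored[name].get("value", item["default"]))
        else:
            merged[name] = dict(item, value=item["default"])
    return merged
```

An upgraded component simply calls this with its new defaults and writes the result back, so an upgrade can reshape its configuration without losing anything the administrator has set.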

The configuration manager also allowed for components to register interest in a configuration category. This meant that if a category was updated by another component, including the administrative REST API, the component would receive a callback from the configuration manager to inform it the category had changed. This callback mechanism not only worked within a micro service but also between micro services. This is the key part of the implementation that allowed for dynamic reconfiguration of FogLAMP.
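The register-interest mechanism amounts to a simple observer pattern. The class below is a minimal single-process sketch of the idea; in FogLAMP itself the callbacks also cross micro service boundaries, which this illustration does not attempt to show.

```python
class ConfigurationManager:
    """Minimal sketch of register-interest: components register a
    callback against a category name and are notified whenever any
    other component updates that category."""

    def __init__(self):
        self._categories = {}
        self._listeners = {}

    def register_interest(self, category, callback):
        """Ask to be called back when the named category changes."""
        self._listeners.setdefault(category, []).append(callback)

    def set_category(self, category, contents):
        """Update a category and notify every interested component,
        enabling dynamic reconfiguration without a restart."""
        self._categories[category] = contents
        for callback in self._listeners.get(category, []):
            callback(category, contents)
```

A south service, for example, would register interest in its own category at startup, then apply new values from inside the callback rather than requiring a restart.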

The configuration manager was, rather unusually, probably the first foundation of FogLAMP to be completed. It is however extremely important to the philosophy of extensibility, discoverability and always-on operation that is fundamental to FogLAMP.

Thursday, 5 September 2019

Entering the Fog

The past two and a bit years have been a very interesting time for me; they mark a new direction for my work life at least, a move into the world of the Internet of Things, Fog computing and Edge devices. As we start the third quarter of 2019 the project I have been working on is about to enter an interesting new phase, but more of that when it actually happens. The event that prompted this change of direction in my life was the creation of a new company, Dianomic Systems, and the start of a new open source project under the guidance of that company. I am lucky enough to be employed by Dianomic as the architect of the project, FogLAMP. The original concept of this project was to create something open source that could be used to gather data from multiple sensors and devices and push that data into a common store. The initial target for this data store was the OSIsoft PI Data Archive, Dianomic being part owned and funded by OSIsoft.

During our early research it soon became clear that there was an entire movement in edge gateways and Fog computing, and that any solution we designed in this area should really aim to be part of this ethos. Hence the Fog in the name of the project. The LAMP part came from the LAMP stack and a desire to build something that would help unify the bewildering number of different implementations and approaches to gathering data from sensors and devices. OSIsoft also had a desire to be able to gather data into their systems that was currently bypassing the traditional industrial control systems. The new breed of IoT sensors was a threat to everyone, data would end up in silos and a lot of the power of collecting that data would be lost without the ability to incorporate it into facility wide analytics systems.

It has long been a principle of mine that when designing any system I want to make it as configurable and extensible as possible, having the ability to adapt and incorporate new requirements without major redesigns or re-implementations. We knew we wanted to gather data from many different sources, using different mechanisms from protocol implementations to talking to raw hardware devices. This data would then be sent to the PI Data Archive to join the data lake within that system. The simplistic approach might be to construct a product for each data source, possibly making use of a set of libraries to implement the common features. Although possible, this was distasteful in my opinion, as it came with big support and versioning issues and it also made the task of supporting a new device a specialist activity. I wanted to make it as simple as possible for a new device or protocol to be supported. I fell back on my previous experience in architecting MaxScale, a MySQL proxy server, and decided we would make use of plugins for the device connections.

Once the decision was made to use plugins to interface with the devices and sensors it seemed natural to also use plugins for sending the data onward to external systems. Although in the first implementation there would only be a single system into which data was to be sent, it seemed like a worthwhile policy for the future proofing of FogLAMP. Now we had the second big extension point within FogLAMP, the plugins for sending data.

It was around about this time that we decided we needed a convenient naming structure, so we went for the compass point system that seems to be in use in a number of IoT systems. North for the route to the cloud and south for the devices and sensors. Hence we have south plugins to talk to devices and north plugins to talk to systems such as the PI Data Archive.

The next big component that was discussed was buffering. There were a couple of reasons we felt buffering was important:
  • To deal with situations where the connection from the south devices to the north system was unreliable or not always available. It was felt FogLAMP would be running close to the south devices rather than with the north systems.
  • In order to adhere to the principles of Fog computing having some historical data, even if only for a short time frame, available locally would seem to be important.
The question was how to buffer the data: what semantics did we require, and how much data did we need to concern ourselves with?

The answer to all these questions is the same, of course: it depends. Given this, how could we design a buffering system that would be all things to all people? We had already answered this of course; it was the same way we could support any kind of device to the south or any kind of system to the north. We would use storage plugins to provide the different scales and semantics of data buffering required.
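The buffering semantics behind any of those storage plugins can be sketched as follows: readings are appended as they arrive, drained north only when a connection is available, and recent history stays readable at the edge in the meantime. This is an illustrative in-memory toy, not any of FogLAMP's actual storage plugins, which persist to real storage engines.

```python
from collections import deque

class ReadingBuffer:
    """Toy sketch of edge buffering: hold readings while the north
    link is down, serve recent history locally, drain when connected."""

    def __init__(self, maxlen=10000):
        # Bounded buffer: the oldest readings drop first when full
        self._buffer = deque(maxlen=maxlen)

    def append(self, reading):
        self._buffer.append(reading)

    def recent(self, n):
        """Serve recent history to edge applications."""
        return list(self._buffer)[-n:]

    def drain(self, send, connected):
        """Send buffered readings north if the link is up; keep them
        buffered (the mining-truck case) if it is not. Returns the
        number of readings sent."""
        if not connected:
            return 0
        sent = 0
        while self._buffer:
            send(self._buffer.popleft())
            sent += 1
        return sent
```

A real plugin would differ mainly in scale and durability (SQLite, an in-memory store, and so on), but the contract seen by the rest of the system is the same.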

The final addition to the functional blocks we had designed so far was a central core to orchestrate and manage these other components. Having defined the functional blocks, the next question was how to implement them. The answer, to me at least, seemed obvious, and again came from previous experiences in creating scalable solutions: use a micro-services architecture. I could envisage us needing multiple micro-services for the different south side devices and sensors, giving us isolation between devices and scalability. Likewise we could use the same approach to the north and provide for multiple destinations for the data.

Following this logic it seemed obvious that the storage should also be done this way, with another service for the orchestrator. And so the FogLAMP architecture was born. This is how it started and how it remains to this day. A few more specialist micro services have been or will be added to this structure, but two years on we are still building things within this basic set of services.