Showing posts with label supercluster. Show all posts

Monday, March 08, 2021

Building the Data Lab Technology Stack

The idea of building a data lab is emerging from my ocean data conversations and from thinking about how best to use my knowledge and skillset within this opportunity. In my mind, the service offering would be twofold:

  1. Data Engineering / Software Development consulting and services with a focus on ocean data. We will do the heavy lifting of extracting, cleansing, transforming, and loading your data. Then we will help with analyzing and visualizing the data. We are comfortable working in both the open source and Microsoft technology stacks.
  2. Standing up (and data loading) the technology stack for the data lab. You are going to need to host all this compute power and storage somewhere. It could be on-premises. Most likely, it will be in the cloud. We can help with this also. We could build it in Azure, using the Microsoft technology stack. Or we could build it using an open source stack on top of Linux in any of the hosted environments of Azure, AWS, or Rackspace.

How do you build a low priced, large compute, technology stack to support data engineering efforts, implement a data lab, and showcase these new service capabilities? The low price is the key factor given the current startup state of this ocean data endeavour, particularly when you think of the cost of compute for processing and storing large amounts of data. I believe the best way forward is as follows:

  1. Use open source where you can. Fortunately, many of the infrastructures, tools, frameworks, and programming languages for the data lab are open source.
  2. Automate the build so it can be built and torn down with ease. This would eliminate the need for the stack to always be running.
  3. Store the data at its source, if possible. Fetch and load the data when you automatically rebuild the stack. Keep in mind this limits the amount of big data you can store locally, and loading large amounts of data can cumulatively take days. Be mindful of this.

Note: this stack is to showcase the services capabilities. A full data lab would also need the ability to both persist and fetch data. It's going to take some time to build the data lab!

The Data Lab Technology Stack

The deployment of this technology stack will use open source wherever possible, running on a Linux (Ubuntu) server hosted at Rackspace. The rationale for these decisions is:

  • Little to no licensing costs
  • Strong familiarity with Rackspace as hosting company
  • Existing domain name (endeavours.com) hosted with Rackspace
  • Extensive experience with Ubuntu Linux in a hosted environment
  • Familiarity with deploying data intensive solutions using the ELK stack
  • Experience programming in Python

Note: The deployment of this technology stack will happen in phases, where each phase will complete with some basic tests to ensure the stack behaves as desired.

Phase 0

Phase 0 will be a basic ELK stack running on an Ubuntu Linux server hosted at Rackspace with access via the endeavours.com web domain. The use case for where the data comes from, how we transform it, the analysis, and visualization is still to be determined. This use case will be used for testing this first iteration of the newly stood up data lab. Exciting times!

Phase 1

During phase 1 we will add the Python programming language to the technology stack and use it for two purposes:

  • Apply a model to the data using Python.
  • Present the processed data to a web page for display.

Phase 2

During phase 2 we will add Kafka as an infrastructure resource, identify some additional data sources, and pre-process the data before it gets loaded into ElasticSearch.
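The pre-processing step planned for phase 2 could look something like the sketch below. It is only an illustration under assumptions: the records are taken to be plain dictionaries already consumed from a Kafka topic, the field name `observed_at` and the index name are hypothetical, and the output is the newline-delimited JSON body that the Elasticsearch bulk API accepts (an action line followed by the document line).

```python
import json
from datetime import datetime, timezone


def preprocess(record):
    """Trim strings, drop empty fields, and convert the timestamp to UTC ISO 8601."""
    cleaned = {}
    for key, value in record.items():
        if isinstance(value, str):
            value = value.strip()
        if value in ("", None):
            continue  # skip fields that carry no information
        cleaned[key] = value
    # "observed_at" is a hypothetical field holding epoch seconds.
    if "observed_at" in cleaned:
        ts = datetime.fromtimestamp(float(cleaned["observed_at"]), tz=timezone.utc)
        cleaned["observed_at"] = ts.isoformat()
    return cleaned


def to_bulk_lines(records, index_name):
    """Build a bulk request body: one action line plus one document line per record."""
    lines = []
    for record in records:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(preprocess(record)))
    return "\n".join(lines) + "\n"  # the bulk body must end with a newline


if __name__ == "__main__":
    sample = [{"station": " buoy-1 ", "temp_c": 4.5, "note": "", "observed_at": "0"}]
    print(to_bulk_lines(sample, "ocean-observations"))
```

Keeping the cleaning in one small function means the same code can be tested on a handful of records before any Kafka consumer or Elasticsearch client is wired in.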

Phase 3 and beyond

Investigate the Apache Data lab stack, add Spark to our lab, add a data workbench...

Friday, March 05, 2021

ODE Newsletter - February 2021

I'm 7 people into working towards my 100 conversations. It is said that you need to have 100 conversations as you solidify your business / startup idea. So this is where I am, seven conversations in. If you know anyone who works with ocean data or works for a business that has an interest in oceans, I'd love to talk with them.

Given the time constraints of being deep into a large data / database migration project, I consider February to have been a good month for conversations. It provided me a good view toward the horizon of ocean data. I followed the conversations that were presented to me without directing the focus. This is the first month, and I have yet to gain clarity on the gaps where I need more information. This makes sense given I am at the beginning and don't know what I don't know. Now that it is the end of February, I have identified the need to talk with customers of ocean data. This could become a focus for March. The conversations for February unfolded in the following order, with the following summaries and highlights:

PropelICT (https://www.propelict.com/)

I reached out to a past co-worker in a leadership position within PropelICT. PropelICT is an Atlantic Canada e-accelerator for tech startups. The conversation was very encouraging and initiated my application to their April cohort. Looking forward to their support in the coming months (and years).

Highlights: 

  • The idea of 100 conversations.
  • My first suggested conversation contact. 
  • Being a candidate for their e-accelerator.

eOceans (https://www.eoceans.co/)

I spoke with one of the principals of eOceans. Time very well spent, thank you! So many details to be digested from this conversation. This organization clearly understands ocean data and where it intersects with social media! A bulleted list seems the best way to call out the highlights:

  • There are many open standards and organizations working in this space. The data standards seem to be "standardizing" and there are many organizations working toward bringing the data standards together. More open organizations are contributing than the closed proprietary types. CIOOS is the standout for Canada. EU and US are much further down the standards and open data path than Canada.
  • Both ends of the data collection (data storage and the IoT end-points) are well serviced with lots of business and startup activity. It's the middle where the greater opportunity exists: the data integration, with consideration for all the standards and granularity. "It would be nice to dust off a 10 year old data set and be able to easily use it".
  • Working with ocean data initiatives is very project based and finding the revenue sources / the business model for an open reference architecture for the digitization of oceans could prove difficult.

Highlights: 

  • Many open organizations already working in the ocean data space. 
  • The business side of what you are exploring (reference architecture) may be difficult, so much work is project based and gov't funded. A reference architecture seems like an NGO or consortium kind of thing.
  • Middle ground of software and data integration could be a big need given my skillset.

Mentorship

Super fortunate to reconnect with an older friend who has loads of experience: small devices, programming, data, startups to a favorable exit, machine learning, etc... many skills that align well with what I am doing. And on top of all this, I really enjoy the meandering conversations we share!

The one area of strong overlap between my ocean data focus and my mentor's previous experience is the integration of data. And yes, he confirmed that integrating data from different devices to a common standard, to create a single view into a broad data realm, is a lot of work.

Highlight: He agreed to provide me mentorship within this endeavour. So great!

New Brunswick Ocean Strategy: Our Opportunities in the Blue Economy

This was an excellent online conference put together by the Ocean Supercluster. What I did most was listen, and a good thing too... I have so much to learn. I really liked the breakout sessions where there was more individual participation. Some names and acronyms are becoming more familiar to me.

Highlight: A small list of contacts I could reach out to. All good!

TechNL (https://www.technl.ca/)

I spoke with one of the leaders in TechNL and we talked about what I want to do with data, in particular, ocean data. The conversation pointed towards two relevant contacts.

Highlight: That if I am going to be successful in this endeavour I am going to need partners. The time required for setting up an organization isn't the best place for me to be focusing my time at this stage of the startup. And given the nature of this startup needing to work in the open, the partnership route may be the best way to go...

Canadian Integrated Ocean Observing System (https://cioos.ca/)

So fortunate to have the attention of two CIOOS employees! They were so gracious and provided a broad and deep amount of information regarding the state of ocean data. Super helpful! CIOOS clearly knows the data. The best way to summarize my conversation is by including the important questions and their answers:

With ocean data where is the greatest pain?

Resources as in financial and skills / knowledge.

At the more general project level: governance, and the people who know how to organize and steward data through its lifecycle. This is more a reference to the industry in general... it's a project issue. And having the ability to integrate with a project that happened years ago...

Do open data standards have an influence?

Absolutely! There are many references to open data. Most of what we deal with are open.

How easily integrated are the existing data sets?

It’s getting better. It can be difficult to get an older data set and want to integrate it. These older sets often lack the granularity or metadata that makes it easier to ingest. There is a definite need here at a project level. Developing an expertise here could become a strong business.

Most initiatives within this space are project based, which makes it difficult for longer initiatives to achieve any data sustainability. Long-term funding initiatives are rare.

Highlights: 

  • So many acronyms, references and URLs. The CIOOS folks provided me many references all pointing in the right direction. Reference to some of the ISO standards. 
  • The need for better stewardship of data so as data ages it still has usefulness.

Pisces Research Project Management (https://piscesrpm.com/)

Another fortunate conversation with a person deep into ocean data, with the added bonus of being very technical. This was a contact I harvested from the New Brunswick Ocean Strategy conference. There are many topics I could summarize from this outstanding conversation, and much of the information confirmed things I discovered in the previous conversations described above. This is good!

I did pitch my idea about mooring buoys as fixed points of data collection, and having these buoys be like the personalized weather stations that have become so popular. This employee loved the idea.

The exciting part of this conversation was the discussion of the technical stack used for open data within the oceans sector. It was good to add this to the knowledge I had of the proprietary technical stack used when I was managing the software engineering dept. at Provincial Aerospace.

What is the most common tech stack for Ocean Data?

This person has extensive experience working with government organizations and academics. From what they have seen, the most common, and emerging, technology stack includes:

    • Python
    • Assorted data storage approaches. Often NOT an RDBMS.
    • QGIS is common.

These are the tools he finds most effective and common. Using QGIS pushes you into the geo representation of data. Much ocean data requires different kinds of models, more 3d, more oceans… not necessarily geographic, etc.

The ability to prove models with real data is the biggest need from a technical perspective. This is why Python has such good traction. It is easy for non-programmers and also rich enough for programmers. It is a good language for data, and useful across the range of technical skills working with data.

NetCDF is the most common data-store. Also CSV and proprietary data storage. Remember data people are mostly not programmers or overly technical.

Also take a look at CKAN (https://ckan.org/)

What are people looking for from a technical perspective?

    • Proving models with real data.
    • Integrating data
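To make the "integrating data" need a little more concrete, here is a minimal sketch of mapping records from two sources onto one common schema. Everything named here is an assumption for illustration: the source names `buoy_a` and `buoy_b`, their field names, the common field names, and the idea that one source reports temperature in Fahrenheit. Real ocean data sets would follow the community standards mentioned in these conversations rather than this toy schema.

```python
def fahrenheit_to_celsius(value):
    """Convert a temperature from Fahrenheit to Celsius."""
    return (value - 32.0) * 5.0 / 9.0


# Hypothetical mapping from each source's field names to a common schema.
SOURCE_MAPPINGS = {
    "buoy_a": {"time": "timestamp", "temp": "water_temp_c"},
    "buoy_b": {"obs_time": "timestamp", "sst_f": "water_temp_c"},
}

# Hypothetical unit conversions, keyed by (source, source field).
CONVERTERS = {("buoy_b", "sst_f"): fahrenheit_to_celsius}


def to_common(source, record):
    """Rename known fields, convert units where needed, and drop unmapped fields."""
    mapping = SOURCE_MAPPINGS[source]
    common = {"source": source}
    for field, value in record.items():
        if field not in mapping:
            continue  # ignore fields the common schema does not cover
        convert = CONVERTERS.get((source, field))
        common[mapping[field]] = convert(value) if convert else value
    return common


if __name__ == "__main__":
    print(to_common("buoy_a", {"time": "2021-02-01T00:00:00Z", "temp": 4.0}))
    print(to_common("buoy_b", {"obs_time": "2021-02-01T00:00:00Z", "sst_f": 41.0}))
```

Keeping the mappings as data rather than code means a newly recovered older data set can often be brought in by adding one mapping entry.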

Highlights: 

  • A deep discussion about the technical stack. The preferred programming languages, data storage, integration approaches, and technical issues.
  • Confirmation that integrating data and proving models is an area of software development opportunity.

Lessons Learned

  1. A reference architecture for the digitization of oceans is not enough to hang a startup or business upon at this time! I do believe it is still a good idea that will take form over time. There is so much work already going on toward a common open architecture that another doesn't need to be started. I truly believe a reference architecture will emerge; it is a matter of when it will happen, not if.
  2. There is a big need for technical and software development skills and knowledge in the data engineering space of ocean data. I believe the opportunity exists for a software development / data engineering consulting firm with the specialty of ocean data.
  3. The idea of an anchored (or fixed) buoy for ocean data collection is very compelling to me. Kind of like the personal weather station but as a fixed mooring buoy. Anyone who has a mooring buoy could replace it with the data buoy, and have real-time data about the conditions at the buoy in preparation for mooring.

Next Steps

  1. March will be the month of broadening my reach. I need to talk with a broader section of people working in the oceans space. I need to find potential customers for the processing and software development in, and around, ocean data. 
  2. I need to start building software tools for the processing of ocean data. I need a reference technology stack showcasing our abilities to work with data.
  3. I need to start developing an elevator pitch for the ocean data software consulting firm. I need customers and revenue to get the real feedback to focus the business mission.

Sunday, February 21, 2021

The Beginning of Ocean Data Endeavours (ODE)

Thirty-nine months ago I started on a deep dive into developing a reference architecture for the digitization of oceans. The idea of developing this reference architecture was initiated by the Canadian Government awarding Atlantic Canada with the Ocean Super Cluster initiative and all my recent work with leading the software engineering group at Provincial Aerospace. My writing and research into ocean data took me to the point where I needed to deepen my understanding of a number of subjects, and I needed this deeper understanding before I could continue the writing and research (even though you could consider deepening understanding as research). I needed to have an intermediate understanding of what had come before and the current state of things with a reference architecture for the digitization of oceans. In particular, I needed to work directly with ocean data and the standards that influence its structure.

Over the last three years I have been lucky to work with Triware Technologies Inc., and together we have found projects that align with this need to deepen my understanding of all things digital and all things ocean. My recent project successes include;

OCIO Digital by Design - I was fortunate to be awarded the opportunity to be the data architect for the initial phase in digitizing the NL government's citizen-facing portal. I remained on the project for the first 12 months through to the portal's launch. Being on the design team to create the data tier and integrate with legacy data was a great achievement. And I deeply enjoyed using a Scrum / Jira approach with a multi-vendor, multi-disciplined team. We achieved a lot in a short period of time.

Lessons Learned - Agile, Scrum and Jira can scale well to a government organization with multiple scrum teams working toward an integrated solution.

Ocean Sector Search - We needed a way to index the Canadian Ocean Sector. So we built a search engine seeded by as many oceans related URLs as our analysts could gather. The technical architecture of this ocean specific search engine can be found in this previous post.

Lessons Learned - With reasonable technical effort Nutch can be configured and seeded to crawl a specific industry sector (in this case Canada Oceans Sector). The Nutch crawl harvested a significant number of pages (> 32000) that were then loaded into the ElasticSearch (ELK) stack while relevancy scoring each page along the way.

NLCHI - My work with the Newfoundland and Labrador Centre for Health Information (NLCHI) was a quick engagement to focus their requirements backlog into a few manageable sprints. I was super fortunate to help get an important project underway and gain insights into the concept of a customer focused data workbench for a specific subject domain.

Lessons Learned - The idea of a personal data workbench is very compelling when you consider the number of data sets already available in the oceans sector. And if we could fold in open and proprietary data sets, while honoring security and privacy we may be onto something...

Nalcor Energy Database Consolidation - So many databases, so little time. One of my favorite kinds of enterprise project is the one that pays for itself, over time, through the savings created by the project's downstream accomplishments. Not revenue generation, but operational expense savings. I believe one of the best KPIs for IT is not new systems implemented, but old systems retired.

Lessons Learned - an amazing amount of data can be moved with the correct use of tools, well built and managed ETL (pipelines), and a mindset of automation.

Argo Floats - 2018

NEXT STEPS

Over the last month I have revisited how to best develop my intermediate understanding of oceans data. After a number of conversations with experts in oceans data, I believe my next steps are twofold: I need to focus on the existing standards for oceans data, and I need to write some code to integrate some open oceans data sets.

I need to find opportunities to work directly with oceans data. If you are in the oceans sector, in any way, and you have the need of a very experienced data engineer, then I would love to help with your project. If you know of an oceans data project in need of a data engineer, please forward on my credentials. Thanks to everyone for reading this far. And thank you, Triware, for your ongoing career support!

Tuesday, November 26, 2019

Ocean Sector Specific Search



I've recently finished building an industry specific search engine. The primary use case is to drive international and domestic business traffic to the Canadian websites doing business within the oceans technology and innovation sectors.

From a technology architecture perspective we built a search engine for the Canadian oceans super cluster initiative where all components run, and are based, upon Canadian assets hosted in Canada. We seeded the search engine using the URLs for all the organizations identified as participants within this economic sector. The indexing process analysed each URL and followed all links up to two hops deep. All the identified URLs were scored using a web graph and the top pages were indexed.
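The crawl-then-score idea described above can be sketched in a few lines of Python. This is not how Nutch is implemented; it is a toy illustration under assumptions: the link graph is a small in-memory dictionary, the function names are hypothetical, and a plain in-link count stands in for Nutch's web-graph scoring.

```python
from collections import deque


def crawl_two_hops(seeds, links, max_hops=2):
    """Breadth-first walk of `links`, returning {url: hops from the nearest seed}."""
    depth = {url: 0 for url in seeds}
    queue = deque(seeds)
    while queue:
        url = queue.popleft()
        if depth[url] == max_hops:
            continue  # do not follow links beyond the hop limit
        for target in links.get(url, []):
            if target not in depth:
                depth[target] = depth[url] + 1
                queue.append(target)
    return depth


def top_pages(depth, links, limit):
    """Rank discovered pages by in-link count (a simple stand-in for web-graph scoring)."""
    inlinks = {url: 0 for url in depth}
    for source, targets in links.items():
        if source not in depth:
            continue  # only count links from pages the crawl reached
        for target in targets:
            if target in inlinks:
                inlinks[target] += 1
    # Highest in-link count first; ties broken alphabetically for a stable order.
    return sorted(inlinks, key=lambda u: (-inlinks[u], u))[:limit]


if __name__ == "__main__":
    graph = {"a": ["b", "c"], "b": ["c", "d"], "c": ["d"], "d": ["e"]}
    found = crawl_two_hops(["a"], graph)
    print(found)
    print(top_pages(found, graph, limit=2))
```

In the example graph, page `e` sits three hops from the seed, so it is never discovered and cannot be indexed.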

The architecture decisions
The NELK stack became our back-end infrastructure.

A number of important architecture decisions were made early on as the project was detailed. Mostly, decisions were made to favour the technologies the small team was already familiar with. And where the team wasn't familiar, we chose technologies that had the most industry support and local resources in our personal networks, so we could get help if we needed it. We ended up having Nutch feeding the ELK stack, using Wordpress for the UX. In the project it became known as the NELK stack.
  • Nutch - for web crawling and first round of web page extraction and cleanse.
  • ElasticSearch (ES) - as the search engine / data manager
  • Logstash - as the data transform and load.
  • Kibana - as the administration / developer console
Crawling the web with Nutch
We ended up using Nutch to crawl the internet for ocean sector specific web pages. We also needed to integrate with ElasticPress so the broader ecosystem search included the contents of our website's Wordpress database. To do all this we used Nutch 1.15, as it integrated best across our technology stack. We followed the Nutch recommended approach of seeding, ingesting, fetching, and removing duplicates as we prepared the data for export to ElasticSearch. Due to versioning issues we exported the Nutch database to CSV before importing the data. For the first load of data, our use of Nutch produced the following page loading metrics:
  • seeded with 2612 domain names
  • removed 709 duplicate or in error domain names
  • identified 86872 candidate webpages 
  • fetched the 29323 most relevant web pages (based upon web graph algorithms)
  • indexed 29270 pages into ElasticSearch
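As a quick way to read these numbers, the following short Python sketch derives a few ratios from them. The figures are copied from the list above; treating the 709 removed names as a subtraction from the 2612 seeds is an assumption about how the counts relate.

```python
# Figures taken from the first data load described above.
seeded_domains = 2612
removed_domains = 709
candidate_pages = 86872
fetched_pages = 29323
indexed_pages = 29270

# Assumption: the removed names come out of the seeded list.
usable_domains = seeded_domains - removed_domains
fetched_share = fetched_pages / candidate_pages
not_indexed = fetched_pages - indexed_pages
indexed_share = indexed_pages / fetched_pages

print(f"usable seed domains:          {usable_domains}")
print(f"fetched share of candidates:  {fetched_share:.1%}")
print(f"fetched but not indexed:      {not_indexed}")
print(f"indexed share of fetched:     {indexed_share:.1%}")
```

Roughly a third of the candidate pages were fetched, and nearly all fetched pages made it into the index.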
Loading data with Logstash, inspecting the results in Kibana
We used Logstash to bring the Nutch exported CSV data into ElasticSearch. Coding up the Logstash job was fairly easy; the most important aspect was choosing the correct Logstash filter. It was better to use the dissect filter rather than the csv filter. More on this in a later post. In the end, I was amazed with how quickly Logstash loaded, and ElasticSearch indexed, all the data.
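The post does not explain why dissect worked better, so the following is only a guess at one difference that can matter, written in Python rather than Logstash configuration. A dissect-style split cuts on the first delimiters and lets the last field keep whatever remains, while a CSV parser treats every unquoted comma as a field boundary. The column names and the sample line are hypothetical.

```python
import csv
import io


def dissect_split(line, field_names, delimiter=","):
    """Split on the first N-1 delimiters so the last field keeps the remainder."""
    parts = line.rstrip("\n").split(delimiter, len(field_names) - 1)
    return dict(zip(field_names, parts))


def csv_split(line, field_names):
    """Parse with CSV rules; extra unquoted commas create extra columns."""
    row = next(csv.reader(io.StringIO(line)))
    return dict(zip(field_names, row))  # columns beyond the names are dropped


if __name__ == "__main__":
    names = ["url", "score", "title"]
    line = "https://example.ca/,0.42,Ocean tech, sensors and buoys\n"
    print(dissect_split(line, names)["title"])
    print(csv_split(line, names)["title"])
```

With free-text fields such as page titles at the end of each row, the positional split keeps the whole title where the CSV-style parse truncates it at the first comma.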

Once all the data was loaded into ElasticSearch I used Kibana to confirm the data was correctly loaded into the data repository. Kibana has a very intuitive interface, and creating filters and running queries to confirm the successful loading of data was straightforward. I look forward to using Kibana to manage the repository and create meaningful dashboards.

Integrating ElasticSearch with WordPress PhP


Integrating with Wordpress
Once we had the back-end built and loaded with industry specific web pages, we still needed to find the correct tool-set to provide a query interface within a Wordpress template and to integrate with the organizations identified in the Wordpress database. We wanted the ecosystem search to include both what we had indexed from the internet and the organizations listed in our directories database. The solution ended up combining two tools:
  • The ElasticSearch (ES) PhP library which provides a mature (and easy to use) set of features to build your own interface into ES using PhP.
  • ElasticPress which allows automated ElasticSearch integration with a wordpress database.
The Wordpress / PhP tools for integrating with ES are very effective. ElasticPress has automation that keeps it up to date as changes are made within the Wordpress database. The ES PhP library has a robust set of features that makes for easy coding and kept search performance very high. Even with large query results, traversing the result set forward and back was easily handled with features available in the PhP library.
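Moving forward and back through a result set typically comes down to the `from` and `size` parameters of an ElasticSearch search request. The sketch below shows that arithmetic in Python rather than PhP, purely to illustrate the shape of the request; the function names and the `content` field are hypothetical and not taken from the project.

```python
def page_query(query_text, page, page_size=10):
    """Build a search body for a 1-based page number using from/size paging."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    return {
        "from": (page - 1) * page_size,  # number of hits to skip
        "size": page_size,               # number of hits to return
        "query": {"match": {"content": query_text}},  # "content" is a hypothetical field
    }


def page_links(page, page_size, total_hits):
    """Work out which neighbouring pages exist for previous/next navigation."""
    last_page = max(1, -(-total_hits // page_size))  # ceiling division
    return {
        "previous": page - 1 if page > 1 else None,
        "next": page + 1 if page < last_page else None,
    }


if __name__ == "__main__":
    print(page_query("mooring buoy", page=3, page_size=20))
    print(page_links(page=3, page_size=20, total_hits=95))
```

Offset paging like this suits the first few thousand hits; very deep result sets usually call for a different paging mechanism.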

In conclusion, using Nutch with the ELK stack provides for a very strong search engine that integrates easily with Wordpress on the front-end. The learning curve for this approach was not overwhelming, and whenever challenges presented themselves the online groups helped us resolve issues within days.

Special thanks to the team put together by Triware Technologies. Without all the other technical people, analysts, business people, data entry, project managers, Oceans Advance, ACOA, Ocean Super Cluster, ElasticSearch support, Azure support, and those clearing the way... none of this would have been possible. Thank-you!




Thursday, July 05, 2018

Seeking New Opportunities

TLDR;
Veteran technology professional seeking new opportunities. If you are requiring 35 years of career success in the software technology realm that spans startups through to large enterprise environments, then I'm your guy!

I have tonnes of experience and I can hit the ground running. I can work as a senior project manager, as an enterprise solution architect, or as a scrum master building a team focused on agile approaches. I can design databases and related structures for your business analytic projects, and I can own your Business Intelligence initiative. I can provide strong, and proven, leadership. And if need be I can occupy the technical director level. You will find I provide great business benefit and good value; my experience provides the ability to save you more than I would cost.

A history of project success
I'm back to consulting in the technology realm. Being a technology consultant has provided my clients, and myself, many project successes. These successes have been both with technology startups and with medium to large business environments. For more details on some of my project successes, please read these two posts describing projects I have managed and provided technical leadership for over the last decade:

Increasing Access to Education - this collection of eight projects helped CLEBC become one of the globe's premier legal educators.
Career success in three year cycles - These four major corporate initiatives leveraged all my abilities to help PAL into their next level of technology capability and realized business value.
Where I can help the most
I can have an immediate positive impact to any technology initiative within large and small organizations. This success can come with startups and enterprise environments, or something new that can leverage my skills, knowledge, and experience.

There are five roles where I can bring the greatest immediate business value:
  1. Enterprise and Solution Architecture - The solution side of designing technical architecture has most often been my preferred role within a software development initiative. I've done this for small startups and large enterprises. Good solution architecture creates a better technical solution that saves development costs, improves quality, and ensures alignment with business needs and the other technologies within the business ecosystem.
  2. Senior Project Management and/or Managing a Software Development Team - Through time I have managed many projects with software development teams varying in size from 3 to 30. I have a proven track record of bringing projects to completion on-time and on-budget. I seriously enjoy assisting a team of technical people to ship software.
  3. Data Analytics Project (Architecture and Team Management) - The B.Tech undergrad degree I completed in the 80s was focused on database management. And since that time my career has had data as its foundation. I do very well with data management, including the design of data structures and related communications (read APIs). I can also work well within big data projects, as my experience with big data goes back to the 1980s. Read one of my big data posts from a few years back to get a sense of this capability; Big Data; Similarities and Differences.
  4. Technical Mentorship - I'm an educator and have experienced many startups during my time in Vancouver. If you want to discuss the technical side, the team management side, and/or the development methodology to an innovation project (startup or otherwise) I can help here.
  5. Innovation - Many times in my career I have taken business strategy and made it happen. If you have a business strategy that requires software innovation I can make it tactical and implement. 
I'm willing to travel
A single flight from St. John's, Newfoundland is my preference. I would also be very interested in consulting work in Vancouver (my home town). I have family there and could easily have all the amenities I require for extended working stays. In a previous blog post I describe how I consider the size of the St. John's marketplace to be 40 million people. I also need to consider the St. John's market to be a single flight away as I seek out new challenges and opportunities. The cities where I will immediately consider consulting opportunities include:
  1. Halifax: it's a morning flight away, and I can be there and back, including a full day's work. I'd also stay for a few days if required.
  2. Toronto, Ottawa, Montreal: single 3 hour flight, couple of days then return home for remote working
  3. Vancouver: longer stays and longer aways... I have family in Vancouver, so it would be a very nice approach to working.
  4. London, England: An outlier, but... It would be nice to work in the EU and I have a British passport so it would ease my travel and work permitting.
One personal constraint
To be succinct, I require work-life balance... mostly it is about having time to care for middle-school age children and aging parents. This means I am available part-time or full-time with loads of schedule flexibility.


Wednesday, January 03, 2018

The oceans data collection platforms

My last post from the end of 2017 spoke to the number of sensors and devices that are currently, and becoming easily, available for data collection. This previous post focused more on the generic types of end-points and sensor technology. In this post we focus on the oceans and look at the variety of end-points used in the maritime environment.

Underwater Sensors
As previously identified there are a growing number of sensors and devices available for collecting data in all types of environments. This is also true for the growing number of "platforms" in which these sensors can be deployed. When we extend our view of sensors to include those used in oceans the list of data capture technologies grows. What is important, beyond the details of the data being emitted by these sensors, is how many are becoming a part of the digitization of oceans data "ecosystem". And with so many sensors, and therefore, so much data... a standardized approach and reference architecture defining the approaches to coalesce all this data will become increasingly important.

The oceans data collection "platforms"

Weather buoys
Weather buoys are instruments which collect weather and ocean data within the world's oceans, as well as aid during emergency response to chemical spills, legal proceedings, and engineering design. Moored buoys have been in use since 1951, while drifting buoys have been used since 1979.


An Integrated Ocean Data System 
Spotter is a web-integrated solution for collecting ocean wave and surface current data, designed from the ground up to be easy to use, intuitive, and extremely low cost. The Spotter Device is a compact, solar-powered, surface-follower, which measures surface waves and currents. Through our online Dashboard you can remotely configure your Spotters, access your data in real time, visualize wave data and position tracks. Your Spotter is already connected, so all you do is turn it on and focus on collecting the data you need.

Autonomous Underwater Vehicle (AUV, or unmanned robot submarine)
AUV are increasingly non-proprietary and made up of “plug-and-play” AUV modules which can be brought together and configured in the field. When assembled with a set of survey-grade sonar modules, a Gavia AUV becomes a self-contained survey solution with a low logistics footprint that is capable of carrying out a wide range of missions for commercial, defense, and scientific applications.

Autonomous Underwater Observatories
A submersible platform enabling tailor-made solutions with sensors from a variety of sources. They can be used off-the-shelf or customized to better cover application requirements. They are capable of long-term deployments and can collect and transmit data through a variety of methods. As an example, review the Deep Argo, a submersible device for collecting data at extreme depths.

Others...
The list doesn't stop at what is described here. Reading and research on ocean sensors and the digitization of oceans will direct the reader toward the large number of devices collecting data in and around the ocean. Honestly, I'm amazed by the number of devices and the amount of data currently being collected about the oceans. I am increasingly of the belief that we need a reference architecture for the digitization of oceans... Actually, I'm surprised there isn't one already... for it would certainly help to bring together all the oceans data collection on a global scale for the good of us all. If you have any knowledge of an open reference architecture for the digitization of oceans please forward this information along. Thank-you!

Over the next few months I will be publishing a series of blog posts describing, in more detail, all the aspects for building a successful digitization of oceans reference architecture. Next up is; "communications" with focus on the data communications available to oceans technology. Please follow along and make comment. For a table of contents of these coming posts please review a companion post; Digitization of Oceans Reference Architecture TOC

Thursday, December 07, 2017

A plethora of end points

There is a growing number of data collection devices available to the digitization of everything (including oceans). The variety of devices and sensors includes everything from temperature through chemicals to acceleration. Combine the number of different sensors with the ability to transfer data over great distances, and monitoring even the most remote places for obscure data points becomes increasingly easy and affordable. The following list provides an overview of the types of devices and sensors available.

Internet of Things (IoT) Sensor Classification from Black Box Paradox.

  • Position / Presence / Proximity
  1. Proximity Sensor - A proximity sensor is a sensor able to detect the presence of nearby objects without any physical contact.
  2. Position Sensor - A position sensor is any device that permits position measurement.
  3. Presence (or Occupancy) Sensor - An occupancy sensor is a motion-detecting device used to detect the presence of a person or object.
  • Motion / Velocity / Displacement
  1. Motion Sensor - A motion detector is a device that detects moving objects. Such a device is often integrated as a component of a system that automatically performs a task or alerts a user of motion in an area.
  2. Velocity Sensor - A velocity receiver (velocity sensor) is a sensor that responds to velocity rather than absolute position.
  3. Displacement Sensor - A displacement sensor (displacement gauge) is used to measure travel range between where an object is and a reference position. Displacement sensors can be used for dimension measurement to determine an object's height, thickness, and width in addition to travel range.
  • Temperature
The temperature sensor detects the current temperature or changes in temperature.  There is a large number of temperature sensors available and a comprehensive list is available on Wikipedia.


  • Humidity / Moisture
  1. Humidity Sensor - A humidity sensor (or hygrometer) senses, measures and reports the relative humidity in the air. It therefore measures both moisture and air temperature. 
  2. Moisture sensor - A moisture sensor is an instrument used for measuring the water vapor in the atmosphere. Sometimes considered the same device as a humidity sensor.
  • Acoustic / Sound / Vibration
  1. Acoustic Sensor - Surface acoustic wave sensors are a class of microelectromechanical systems (MEMS) which rely on the modulation of surface acoustic waves to sense a physical phenomenon.
  2. Sound sensor - A sound sensor can detect the sound intensity of the environment. The Sound Detector is a small board that combines a microphone and some processing circuitry. It provides not only an audio output, but also a binary indication of the presence of sound, and an analog representation of its amplitude.
  3. Vibration sensor - A vibration sensor detects vibrations. It is a transducer, such as one incorporating a laser or a piezoelectric crystal, that converts vibrations into an electrical equivalent such as a voltage. Also called a vibration transducer or vibration pickup.
  • Chemical / Gas
  1. Chemical Sensor - A chemical sensor is a self-contained analytical device that can provide information about the chemical composition of its environment, that is, a liquid or a gas phase. 
  2. Gas sensor - A gas detector is a device that detects the presence of gases in an area, often as part of a safety system. This type of equipment is used to detect a gas leak or other emissions and can interface with a control system so a process can be automatically shut down.
  • Flow
Flow Sensors monitor liquid flow rates and accumulated flow volume. Flow measurement is the quantification of bulk fluid movement and can be measured in a variety of ways.
  • Force / Load / Torque / Strain / Pressure
  1. Force Sensor - A Force Sensor is defined as a transducer that converts an input mechanical force into an electrical output signal. Force Sensors are also commonly known as Force Transducers.
  2. Load Sensor - A Load Sensor is defined as a transducer that converts an input mechanical force into an electrical output signal. Load Sensors are also commonly known as Load Transducers or Load Cells.
  3. Torque Sensor - A torque sensor, torque transducer or torque meter is a device for measuring and recording the torque on a rotating system, such as an engine, crankshaft, gearbox, transmission, rotor, a bicycle crank or cap torque tester. Static torque is relatively easy to measure.
  4. Strain Sensor - A strain gage (sometimes referred to as a strain gauge) is a sensor whose resistance varies with applied force; it converts force, pressure, tension, weight, etc., into a change in electrical resistance which can then be measured.
  5. Pressure - A pressure sensor is a device for pressure measurement of gases or liquids. Pressure is an expression of the force required to stop a fluid from expanding, and is usually stated in terms of force per unit area. A pressure sensor usually acts as a transducer; it generates a signal as a function of the pressure imposed.
  • Leaks / Levels
  1. Leak Sensor - Leak detection is used to determine if, and in some cases where, a leak has occurred in systems which contain liquids and gases.
  2. Level Sensor - Level sensors detect the level of liquids and other fluids and fluidized solids, including slurries, granular materials, and powders that exhibit an upper free surface.
  • Electric / Magnetic
  1. Electric Sensor - A current sensor is a device that detects electric current in a wire, and generates a signal proportional to that current. The generated signal could be analog voltage or current or even a digital output. The generated signal can be then used to display the measured current in an ammeter, or can be stored for further analysis in a data acquisition system, or can be used for the purpose of control.
  2. Magnetic Sensor - A MEMS magnetic field sensor is a small-scale microelectromechanical systems (MEMS) device for detecting and measuring magnetic fields (Magnetometer).
  • Acceleration / Tilt
  1. Acceleration Sensor - An accelerometer is a device that measures proper acceleration. Proper acceleration, being the acceleration (or rate of change of velocity) of a body in its own instantaneous rest frame, is not the same as coordinate acceleration, being the acceleration in a fixed coordinate system.
  2. Tilt Sensor - A clinometer or inclinometer is an instrument for measuring angles of slope (or tilt), elevation or depression of an object with respect to gravity. It is also known as a tilt indicator, tilt sensor, tilt meter, slope alert, slope gauge, gradient meter, gradiometer, level gauge, level meter, declinometer, and pitch & roll indicator.
  • Machine Vision / Optical / Ambient Light
  1. Machine Vision Sensor - As a simple concept, machine vision is the use of devices for optical non-contact sensing to automatically receive and interpret an image of a real scene in order to obtain information and/or control machines or processes. Machine vision (MV) is the technology and methods used to provide imaging-based automatic inspection and analysis for such applications as automatic inspection, process control, and robot guidance, usually in industry. Machine vision is a term encompassing a large number of technologies, software and hardware products, integrated systems, actions, methods and expertise. Machine vision as a systems engineering discipline can be considered distinct from computer vision, a form of computer science.
  2. Optical Sensor - Electro-optical sensors are electronic detectors that convert light, or a change in light, into an electronic signal.
  3. Ambient Light Sensor - A device that detects the amount of light in the vicinity. An ambient light sensor may be built into a smartphone or tablet to adjust the screen brightness based on the available light.
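The classification above can also be captured as a small lookup structure, which is handy when tagging incoming data by sensor category. The following is only a sketch: the names `SensorCategory`, `TYPES_BY_CATEGORY`, and `category_of` are hypothetical and chosen for illustration, not taken from the Black Box Paradox classification or any standard.

```python
from enum import Enum


class SensorCategory(Enum):
    """The twelve categories from the list above (identifier names are illustrative)."""
    POSITION_PRESENCE_PROXIMITY = "Position / Presence / Proximity"
    MOTION_VELOCITY_DISPLACEMENT = "Motion / Velocity / Displacement"
    TEMPERATURE = "Temperature"
    HUMIDITY_MOISTURE = "Humidity / Moisture"
    ACOUSTIC_SOUND_VIBRATION = "Acoustic / Sound / Vibration"
    CHEMICAL_GAS = "Chemical / Gas"
    FLOW = "Flow"
    FORCE_LOAD_TORQUE_STRAIN_PRESSURE = "Force / Load / Torque / Strain / Pressure"
    LEAKS_LEVELS = "Leaks / Levels"
    ELECTRIC_MAGNETIC = "Electric / Magnetic"
    ACCELERATION_TILT = "Acceleration / Tilt"
    MACHINE_VISION_OPTICAL_AMBIENT_LIGHT = "Machine Vision / Optical / Ambient Light"


# Sensor types listed under each category in the post.
TYPES_BY_CATEGORY = {
    SensorCategory.POSITION_PRESENCE_PROXIMITY: ["proximity", "position", "presence"],
    SensorCategory.MOTION_VELOCITY_DISPLACEMENT: ["motion", "velocity", "displacement"],
    SensorCategory.TEMPERATURE: ["temperature"],
    SensorCategory.HUMIDITY_MOISTURE: ["humidity", "moisture"],
    SensorCategory.ACOUSTIC_SOUND_VIBRATION: ["acoustic", "sound", "vibration"],
    SensorCategory.CHEMICAL_GAS: ["chemical", "gas"],
    SensorCategory.FLOW: ["flow"],
    SensorCategory.FORCE_LOAD_TORQUE_STRAIN_PRESSURE: [
        "force", "load", "torque", "strain", "pressure",
    ],
    SensorCategory.LEAKS_LEVELS: ["leak", "level"],
    SensorCategory.ELECTRIC_MAGNETIC: ["electric", "magnetic"],
    SensorCategory.ACCELERATION_TILT: ["acceleration", "tilt"],
    SensorCategory.MACHINE_VISION_OPTICAL_AMBIENT_LIGHT: [
        "machine vision", "optical", "ambient light",
    ],
}

# Invert the mapping so a sensor type can be looked up directly.
CATEGORY_BY_TYPE = {
    sensor_type: category
    for category, sensor_types in TYPES_BY_CATEGORY.items()
    for sensor_type in sensor_types
}


def category_of(sensor_type: str) -> SensorCategory:
    """Return the category for a sensor type, ignoring case and outer whitespace."""
    key = sensor_type.strip().lower()
    if key not in CATEGORY_BY_TYPE:
        raise KeyError(f"unknown sensor type: {sensor_type!r}")
    return CATEGORY_BY_TYPE[key]
```

For example, `category_of("Tilt")` returns `SensorCategory.ACCELERATION_TILT`, and an unlisted type raises `KeyError` rather than being silently mis-filed.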
Security matters!
It is also very important to note that as sensors become increasingly available and controlled over the network, security should be a major concern. Industrial controllers are increasingly targeted for security vulnerabilities, and if an oceans sensor is reachable from a remote location over a network it is potentially open to attack. Be aware of the data available from the sensor and of any control features the sensor (activator, controller) may have.

The other end
When considering the digitization of oceans reference architecture and what is considered an end-point we need to also look to the other end. And by the other end, I mean the data storage end. In this post we have discussed all the end points that emit the data, and at some point the data will need to be "at rest" stored in some storage device. The topic of data storage will be discussed in a later post.

Some examples of end-points
  • Nomad - The AXYS NOMAD is a unique aluminum environmental monitoring buoy designed for deployments in extreme conditions. The NOMAD (Navy Oceanographic Meteorological Automatic Device) is a modified version of the 6m hull originally designed in the 1940’s for the U.S. Navy’s offshore data collection program. It has been operating in Canada’s Weather Buoy network for over 25 years and commonly experiences winter storms and hurricanes with wave heights approaching 20m.

  • Coast Guard Canada - it is very hard to imagine this autonomous vessel will not be loaded with sensors to collect data. Portsmouth, UK, based ASV Global has converted a 26ft hydrographic survey launch to enable it to operate autonomously using the ASView control system, while maintaining its ability to operate in a conventional manned mode. The launch, which is part of the Canadian Coast Guard’s fleet dedicated to the survey operations of the Canadian Hydrographic Service, will be used as a test platform for unmanned survey work.

  • Personal Weather Stations - A personal weather station is a set of weather measuring instruments operated by a private individual, club, association, or even business (where obtaining and distributing weather data is not a part of the entity's business operation). Personal weather stations have become more advanced and can include many different sensors to measure weather conditions. These sensors can vary between models but most measure wind speed, wind direction, outdoor and indoor temperatures, outdoor and indoor humidity, barometric pressure, rainfall, and finally UV or solar radiation.



Sunday, November 26, 2017

three posts for digitization of oceans reference architecture

The next three posts within my series describing the need for a digitization of oceans reference architecture will be focused on the three technology domains of; end-points, communications, and data stores. This separation is to allow a deeper look into each domain as they have different considerations in relation to technology architecture and attributes important to the digitization of oceans.


End Points: the sensors and devices which collect and emit data. Consider this the Internet of Things (IoT) that can be located anywhere within and around oceans, airborne, surface, and submersible.

https://en.wikipedia.org/wiki/User:Peterrawsthorne/Digitization_of_Oceans#End_Points

Communications: the communications technologies available to transfer data from one place to another. There is a lot to explore within this domain, as underwater data transmission is an emerging technology and the structure of the data messages will become the foundation of the reference architecture.

https://en.wikipedia.org/wiki/User:Peterrawsthorne/Digitization_of_Oceans#Communications

Data Stores: there are many existing data storage approaches, locations, and structures that can be used to store the oceans data. Databases and database designs are already available for many of the subjects within the digitization of oceans. It is better to use existing methods to store the structured and unstructured data and use a federated approach to bring them together.

https://en.wikipedia.org/wiki/User:Peterrawsthorne/Digitization_of_Oceans#Data_Stores
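To make the federated idea a little more concrete, here is a minimal sketch, assuming each existing store can be wrapped behind one small common interface. Everything in it (the `DataStore` interface, the `InMemoryStore` stand-in, the record fields, and the store names) is hypothetical and only illustrates the pattern of asking many stores the same question and merging the answers.

```python
from typing import Iterable, Protocol


class DataStore(Protocol):
    """Hypothetical common interface that each existing store would be wrapped in."""

    def query(self, variable: str) -> Iterable[dict]: ...


class InMemoryStore:
    """Stand-in for a real database; holds records as plain dictionaries."""

    def __init__(self, name: str, records: list[dict]):
        self.name = name
        self.records = records

    def query(self, variable: str) -> Iterable[dict]:
        for record in self.records:
            if record["variable"] == variable:
                # Tag each result with its origin so provenance is not lost.
                yield {**record, "source": self.name}


def federated_query(stores: Iterable[DataStore], variable: str) -> list[dict]:
    """Ask every store the same question and merge the answers in time order."""
    merged = [row for store in stores for row in store.query(variable)]
    # Timestamps share one ISO 8601 UTC format, so string order is time order.
    return sorted(merged, key=lambda row: row["observed_at"])


buoys = InMemoryStore("buoy-archive", [
    {"variable": "temperature_c", "observed_at": "2017-11-26T02:00:00Z", "value": 7.1},
    {"variable": "wave_height_m", "observed_at": "2017-11-26T02:00:00Z", "value": 2.3},
])
auvs = InMemoryStore("auv-archive", [
    {"variable": "temperature_c", "observed_at": "2017-11-26T01:00:00Z", "value": 6.8},
])

results = federated_query([buoys, auvs], "temperature_c")
```

The data stays where it already lives; only the thin wrapper around each store has to agree on the shared vocabulary, which is where the reference architecture would come in.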

Creating an inventory
The number of technologies, vendors, standards, and approaches within these three domains will be large and forever growing. To start documenting this inventory I have created a Wikipedia page under my Wikipedia account. Once this page describing the Digitization of Oceans and its Reference Architecture is more complete I will submit it as a published article; until then, consider it a work in progress. Feel free to join in and edit the work in progress wiki page;

https://en.wikipedia.org/wiki/User:Peterrawsthorne/Digitization_of_Oceans#Lists

Tuesday, November 21, 2017

What is a reference architecture?

There are many descriptions of reference architectures available on the web. Here is a list of some that I consider do a good job of describing the subject while supporting the description I am working toward in the digitization of oceans;
  1. Reference Architecture: The best of best practices - given its age (published 2002) it is still relevant and pragmatic. Though I do consider the description too dependent upon RUP, which introduces many weighty practices and misses some of the more agile and emergent approaches. Still the description gives good detail to the importance, breadth, and depth of the reference architecture. The later sections of this description provide information on creating, using, updating, and working with a reference architecture. These sections are particularly useful in developing the digitization of oceans reference architecture. I strongly believe the oceans reference architecture will be emergent as many new technologies and stakeholders contribute and become a part of developing the architecture.
    Emergent architecture is when organizational structures such as business processes and technologies are designed incrementally by many designers.
  2. Wikipedia: Reference Architecture - very short for a complex topic, but it is to the point in defining the reference architecture as templates within a subject, industry, or domain. It stresses the importance of a common vocabulary and of drawing upon successful projects within the domain. It aligns with the use of APIs which I believe will become an important part of a strong digitization of oceans reference architecture. It also calls out a number of the benefits derived from the reference architecture.

  3. CIO Online Magazine - describes where the reference architecture fits within the EA toolkit, and looks to all the relationships among business, systems, and technology. It describes how the reference architecture can greatly assist in defining specific technical deliverables within these complex systems. Having a proven, standards based, and shared toolkit for developing the oceans reference architecture will assist in keeping the architecture team distributed throughout Atlantic Canada well aligned when creating and maintaining the reference architecture.
Some example reference architectures:
  1. Microsoft Industry Reference Architecture for Banking (MIRA-B)
  2. A Reference Architecture for The Open Banking Standard
  3. IBM Insurance Reference Architecture
  4. Healthcare Reference Architecture
These four serve as examples of reference architectures from established industries where the patterns, technologies, and architectures have developed through time. As you read through these you can get a sense of the value, industry collaboration, growth, and innovation that can be facilitated by having a comprehensive industry reference architecture.

What is unique about a digitization of oceans reference architecture?

The infrastructure of the Digital Ocean. (Courtesy of Liquid Robotics, a Boeing Company)
While it's too early in this discussion to be specific about the oceans reference architecture, it is important to note that it is both broad and deep. It is broad in that it includes many ground based systems, processes, and infrastructure similar in complexity to the previously mentioned examples. In addition to this broadness we need to add the theater in which the digitization of oceans operates; we have vessels of many types (airborne, surface, and submersible), we have a growing collection of sensors and protocols, we have cross industry collaborations (fisheries, environment, oil and gas, shipping, researchers, academia, defense, etc.). I believe it is safe to say the reference architecture for the digitization of oceans will be very broad due to the number of data collection points and the number of intersections (technical and otherwise). The oceans reference architecture will also be very deep in that the data will be coming from many sources above and below the ocean surface. And the data that is being collected will be both very specific and detailed, while also being general at a more meta level. I believe it is the breadth and depth of the digitization of oceans that make its reference architecture unique. And its creation is a large, important, and emerging challenge... more on this to come.

Over the next few months I will be publishing a series of blog posts describing, in more detail, all the aspects for building a successful digitization of oceans reference architecture. Next up is; "a plethora of end points" with focus on oceans technology and the Internet of Things (IoT). Please follow along and make comment. For a table of contents of these coming posts please review a companion post; Digitization of Oceans Reference Architecture TOC

Monday, November 13, 2017

Digitization of Oceans Reference Architecture TOC

For a sense of where I am going with this series of posts describing a reference architecture for the digitization of oceans, please consider this "table of contents". As I complete items in the list, I will update this TOC;
  1. Introduction - summary of why a series of posts describing a digitization of oceans reference architecture.
  2. What is a reference architecture? - summary of the existing online descriptions of reference architecture and why it is important to building a strong technology ecosystem.
  3. A plethora of end points - with new Internet of Things (IoT) end points coming available with increasing frequency we look to how many sensor types are available to an oceans reference architecture and some examples of how they are being used. I've included a couple more posts describing end points; as I deepened my research I felt the descriptions needed to be expanded to include what is happening as end-points in the oceans and in the air.
  4. Communications - description of the current state of data communications above and below the oceans surface. And why it matters to the reference architecture.
  5. Messaging Standards - the structure of the data packages (or messages) between the endpoints is a very important attribute of a successful reference architecture.
  6. What is the digitization of oceans? - a high level description of the digitization of oceans. This will detail the breadth and depth of what is considered the digitization of oceans. This description should also consider the intersection of the different ecosystems of; business, innovation, and knowledge.
  7. What is a digitization of oceans reference architecture? - comprehensive diagram of the entities within the oceans reference architecture with detailed description of each item and their connections (digital or otherwise).
  8. The importance of good governance - the dynamic nature of innovation within the digitization of oceans will cause many elements of the reference architecture to be changing. To encourage interoperability at all levels (technical and otherwise) having good governance will be paramount for success.
  9. The economic value to be found in the oceans reference architecture - why is a reference architecture valuable for community, business, innovation, etc. And why Atlantic Canada should be a major contributor or primary steward of the reference architecture.
  10. How to create the digitization of oceans reference architecture - what is the road map in completing the first release of a digitization of oceans reference architecture. I purposely say first release as this reference architecture will need constant tending as new technologies and capabilities come available.
Keep in mind this is meant to kick off a conversation about creating a reference architecture for the digitization of oceans. This is NOT something I want to do on my own, or believe I could do effectively on my own without contributions from others and a few years of focused effort. I really want engagement across Atlantic Canada to discuss the idea of creating this reference architecture and to become stewards of the reference architecture as it is used globally for the benefit of everybody.

Disclaimer - All views expressed on this site are my own and do not represent the opinions of any entity whatsoever with which I have been, am now, or will be affiliated. They are views created by my many years as an IT professional and, more importantly, an enterprise architect responsible for building large and distributed systems.

Sunday, November 12, 2017

Reference architecture for the digitization of oceans

I strongly believe one of the cornerstones for the successful digitization of oceans is a reference architecture. I believe this also holds true for the digitization of oil and gas, but I see this digitization as a subset of the oceans. Or more specifically, I see the digitization of maritime oil and gas as a subset of the digitization of oceans. I digress... One of the most important aspects of the reference architecture is its openness (as opposed to proprietary). If we want to spur innovation in Atlantic Canada we need every small, medium, and large organization to realize benefit from shared digital resources. We need a way for all these organizations to openly communicate and build this digital ecosystem. The digitization of oceans reference architecture will define (or utilize existing approaches for) the "language" with which all oceans technologies communicate with one another and remember their collective history.

An example; the reference architecture would specify the digital messaging structure for an ocean temperature event. Therefore, when a small startup (that specializes in ocean temperature sensors) needs to publish its data, it need only comply with the messaging structures for ocean temperature. This would allow everyone in the ecosystem to get at the temperature event data as soon as it is available. Also important: the startup's market for ocean temperature sensors includes everyone who is aligned with the oceans reference architecture and its messaging structures.
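To give a feel for what such a messaging structure could look like, here is a minimal sketch assuming a JSON encoding. The class name, field names, units, and sensor identifier are all hypothetical, chosen for illustration only; they are not drawn from any existing oceans messaging standard.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class OceanTemperatureEvent:
    """Hypothetical message structure; every field name here is illustrative."""

    sensor_id: str        # publisher-assigned identifier
    observed_at: str      # ISO 8601 timestamp in UTC
    latitude: float       # decimal degrees, -90 to 90
    longitude: float      # decimal degrees, -180 to 180
    depth_m: float        # metres below the surface, zero or greater
    temperature_c: float  # degrees Celsius

    def __post_init__(self):
        # Reject messages that do not comply with the agreed structure.
        if not -90.0 <= self.latitude <= 90.0:
            raise ValueError("latitude out of range")
        if not -180.0 <= self.longitude <= 180.0:
            raise ValueError("longitude out of range")
        if self.depth_m < 0:
            raise ValueError("depth must be non-negative")

    def to_json(self) -> str:
        """Serialize the event for publishing."""
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, payload: str) -> "OceanTemperatureEvent":
        """Rebuild (and re-validate) an event received from a publisher."""
        return cls(**json.loads(payload))


# The startup publishes an event...
event = OceanTemperatureEvent(
    sensor_id="startup-buoy-001",
    observed_at=datetime(2017, 11, 12, 12, 0, tzinfo=timezone.utc).isoformat(),
    latitude=44.65,
    longitude=-63.57,
    depth_m=10.0,
    temperature_c=7.4,
)
message = event.to_json()

# ...and any consumer aligned with the same structure can read it.
restored = OceanTemperatureEvent.from_json(message)
```

The point of the sketch is that the publisher and every consumer share one definition, so validation happens the same way on both sides of the exchange.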

Another example; the reference architecture would specify the underwater wireless communications approaches and suggested protocols and practices. All allowing the temperature event data to be broadcast and communicated to the historical repository for archiving.

Another example; the reference architecture would specify all the messaging structures and the data storage approaches so the data could be archived and be available through time. The historical archive would allow for retrieval, searching, research, planning, and analysis.
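As a rough illustration of the archive and retrieval side, the sketch below stores a few temperature events in SQLite (via Python's built-in `sqlite3` module) and queries them over a time window. The table name, column names, sensor identifiers, and sample readings are all made up for the example; a real historical repository would likely look quite different.

```python
import sqlite3

# An in-memory database stands in for the historical repository.
archive = sqlite3.connect(":memory:")
archive.execute(
    """
    CREATE TABLE temperature_event (
        sensor_id     TEXT NOT NULL,
        observed_at   TEXT NOT NULL,  -- ISO 8601, always UTC
        depth_m       REAL NOT NULL,
        temperature_c REAL NOT NULL
    )
    """
)

# Made-up sample readings, for illustration only.
events = [
    ("buoy-001", "2017-11-10T00:00:00+00:00", 10.0, 7.9),
    ("buoy-001", "2017-11-11T00:00:00+00:00", 10.0, 7.6),
    ("buoy-002", "2017-11-11T00:00:00+00:00", 50.0, 5.1),
    ("buoy-001", "2017-11-12T00:00:00+00:00", 10.0, 7.4),
]
archive.executemany("INSERT INTO temperature_event VALUES (?, ?, ?, ?)", events)
archive.commit()


def mean_temperature(conn, sensor_id, start, end):
    """Average temperature for one sensor over the half-open window [start, end).

    Timestamps are compared as text, which only works because every row
    uses the same ISO 8601 format and the same UTC offset.
    """
    row = conn.execute(
        """
        SELECT AVG(temperature_c) FROM temperature_event
        WHERE sensor_id = ? AND observed_at >= ? AND observed_at < ?
        """,
        (sensor_id, start, end),
    ).fetchone()
    return row[0]  # None when no rows match


average = mean_temperature(
    archive, "buoy-001", "2017-11-10T00:00:00+00:00", "2017-11-12T00:00:00+00:00"
)
```

Because the messaging structure fixes the fields and units up front, the archive schema follows almost directly from it, and queries like this one work across every publisher's data.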

Over the next few months I will be publishing a series of blog posts describing, in more detail, all the aspects for building a successful digitization of oceans reference architecture. Next up is; "what is a reference architecture" with focus on oceans technology. Please follow along and make comment. For a table of contents of these coming posts please review a companion post; Digitization of Oceans Reference Architecture TOC
