Critical Technology

Tuesday, May 14, 2024

Blogging in a time of AI

I am renewing my frequency in blog posting. This will come after an almost 5 year break from blogging. I am returning because I am back to working on open-source software, educational projects, and the digitization of oceans. And I know I have a learning and success journey to share.

Some Background

I started blogging twenty years back. Yes, 20 years. I was an early adopter and I was involved with technology startups and how the internet was influencing education. At this time I was a big believer that blogging was about content creation. Adding to the collective of the internet by adding meaningful and descriptive content rather than only being a consumer of content. To date, I have published over 500 posts to my assorted blogs. Most of this work was in the first 10 years of my blogging. I essentially posted once a month for the first 10 years. I did have a year where I posted over 100 times. As a summary this is how I posted over the last 20 years.

2004 to 2014 - I published 420 blog posts with good readership. I had a year where I posted more than 100 times. I set this as a goal / experiment to see if I could post twice a week. I did this while holding full-time work, which meant many early mornings and late evenings writing. I learned a whole bunch and my writing skills improved.

The themes for these first ten years was mostly;

technology leadership for startups,
hard-core technology and approaches,
innovative and emerging education,
and the intersection of these three.

2015 to 2024 - I published only 41 blog posts during this 10 years, and nothing in the past three years. Honestly, I was distracted by raising my family of three and doing a whole bunch of life living. Not so focused on work and career advancement.

The themes for the last ten years have mostly been;

integrating with technology community in St. John's NL (I moved),
continue my work on digital badges and micro-credentials,
development of an ocean data startup (still a work in progress),
and working the idea of a reference architecture for the digitization of oceans.

Most exciting of all this is more than 1/2 million views during the 20 years and at some points having over 2500 weekly views. What have I learned from all this blogging? Mostly, that having to write and publish openly to the internet helps the overall community knowledge and it helps me learn more deeply in these chosen subjects.

Next Steps

Again, I will use blogging as my cognitive gymnasium. My subjects themes haven't changed and I will focus upon two main subject areas and continue with updates to this critical technology blog;

Education technology, Heutagogy, and the self-directed learner.

The Happy Heutagogue - https://heutagogue.blogspot.com/

Many things related to, and in support of, the authoring a reference architecture for the digitization of oceans.

The Ocean Set CTO - https://oceanset.blogspot.com/

And my continued musings about technology through my gen X view of the world.

Critical Technology - https://criticaltechnology.blogspot.com/

With all my work and R&D efforts I will learn a bunch of stuff and apply this to the real world through the successful projects I am a part of. I will reflect upon these successes and all that I have learned I will create content that can provide further learning for those around me. And hopefully they will also be entertaining reads.

Collaborating with your AI partner

Blogging has changed for me. There has been a lot of technical and social change since I did most of my blog posting over a decade ago. I had a few focused subjects I was very passionate about, and I wrote about them often. I wrote unincumbered for I would consider myself an early adopter and there was less people publishing to the internet in my chosen subjects. Today there is much more content covering these subjects. And all this content comes in photos, videos, audio, and written articles. Artificial Intelligence is doing a great job of creating and summarizing the content which addresses the complex audience needs and their questions and prompts. Content creation has changed. For a human content creator I believe our work needs to be more intelligent, critical, and creative. Content creators in a time of AI need to do what the AI cannot; daydream, reflect on unrelated subjects, see unlikely connections, be critical, add meaning, create new content that falls between the generated content, fact check and confirm, and add more human intelligence.

How will my blog writing process change? Um, it already has...

I must reflect and draw upon my mastery and do my best to add the new content that AI cannot... AI needs our creativity because it has already parsed the published body of human knowledge. For more insight on my approach, use your favorite large language model chatbot (ChatGPT, Gemini) with the prompt 'limitations of generative AI' followed by the prompt 'How would you suggest a human writer overcome these AI limitations'.

Step by Step my blogging will partner with AI and follow this basic approach;

Capture ideas for new posts, be verbose, be imaginative, think about context
Put these ideas to incomplete blog posts, work ideas for days, for weeks...
Read extensively, add to the understanding of any specific idea
Keep references, cut and paste to the bottom of the related incomplete posts
Prompt AI with phrases from the idea generation
Take blocks of text from written ideas and push them into generative AI, be critical, harvest what you can.
Take the written blog post and ask AI for a rewrite. Change your audience. be critical, harvest what you can.
Try and see, try and write, what AI cannot... add to the body of human knowledge.
Add story telling to improve the overall post
Find pictures to support the writing, format for engagement. Use AI to generate images from passages of text taken from the blog post.
Format, edit, improve, repeat. Be bold... Publish.
Use AI to improve the quality of the writing... Publish again.
Rest, reflect, improve... Publish again.
Yes, I am an advocate to publish before writing is perfect. Publish and then make improvements over the days and weeks that follow. Once the post is considered finished finished... promote it on social media.
Identify what is most important about the post and rewrite for the LinkedIn business audience. Publish to LinkedIn.
Repeat...

Suggested reading: https://www.weforum.org/agenda/2023/02/ai-can-catalyze-and-inhibit-your-creativity-here-is-how/

Friday, April 09, 2021

The Leanstack Way

The Oceans of Data Lab is honored to be a part of PropelICT's startup accelerator. We had our kickoff meeting a few days ago and the current focus is on learning the Leanstack methodology and using lean canvas to tease out ALL the important details for success. The inline supporting learning modules that are available through leanstack are very helpful.

The Lean Canvas - numbers indicate the order of completion

I feel fortunate that I have been familiar with Agile and Lean approaches for over 20 years. I've got two favorite sayings I use when running software teams, and I like to think I run many aspects of my life with an Agile / Lean mindset.

Ship and ship often (deliver new releases as often and frequently as possible)
Fail and fail often (take risks, innovate, don't apologize, keep moving), success comes from failure.

For me the use of Lean in startups all began with Eric Ries when I watched a YouTube interview of Eric conducted a decade ago. This interview became a part of a 2011 blog post where I describe lean approaches within the Director of Technology role. Since this time I have revisited the works of Eric Ries every few years, he has a lot of useful insights to lean startups. One of my all time favorites in the talk he gave at Google 10 years back.

Google Talk: Eric Ries and the Lean Startup

Thursday, April 08, 2021

ODL Newsletter - March 2021

The Oceans of Data Lab (ODL) monthly newsletter is also finding its footing. It is still going to include monthly updates to the progress we make AND it will start with a few articles of interest within the data labs technology world. I am discovering so many interesting technologies and approaches within the data realm. I'm going to fold my 30 years of data experience into why I believe these are of interest to those working with large amounts of data.

Apache Data Lab

The Apache data lab that comes from the same organization that has brought us so many of the important technologies over the years. And specifically, to think of all the big data technologies they have delivered in recent years... there are just to many to list. What I like most about the data lab is its ability to be deployed to the big three cloud hosting environments. Super smart given the storage and compute requirements for data projects shouldn't be the responsibility of Apache.

DataOps and the DataKitchen

DataOps is a fairly recent concept / term that is about seven years old... and it makes sense that it becomes a discipline in itself as it is not DevOps for Data, it is so much more. The DataKitchen looks to be doing some amazing work in this capacity and have published a good read to help get your head into this important and emerging technology space.

I'm another 3 people into working towards my 100 conversations. It is said that you need to have 100 conversations as you solidify your business / startup idea. So I managed to get another three conversations in. I know this isn't that many, but that's ok as this month was more about setting up technology and thinking about risk, revenue, and the escalator pitch for the startup. I still need to talk with people, and I need peoples help, always. If you know anyone who works with analyzing data or works for a business that has a growing interest in their data, I'd love to talk with them.

What has changed this month?

Over this month my thinking has broadened and become more focused on the needs of organizations and their data. No real pivot, but clarifying what the business will be. The changes fell into three main themes;

A broader interest in helping people with their data. The backstory to my career has always been the information technology around the data. For 30 years I have focused on managing, moving, and building software for the data. This will continue with the data lab. We are still interested in ocean data and a reference architecture for the digitization of oceans, these subjects will become part of the bigger data lab.
It's a Data Lab. It became very clear this month that what I was wanting to do is stand up and run a data lab. I had a great conversation with Graham Truax at Innovation Island and this identified the alignment with my accelerator pitch and the data lab concept. After I re-read my proposal (and subsequent acceptance) to the PropelICT accelerator I confirmed... the startup is focused on creating a data lab with related products and services.
Start with a services focus, rather than product. We need revenue and the data lab is not a small product with a near MVP that can generate revenue. There are a number of MVP's that could bring business value for our customers, but nothing with significant revenue possibility. So our focus needs to be on services where we can leverage the skills and knowledge of the founder and identify projects that align well with the overall vision for Oceans of Data Lab.

It's been a business and technology focused month

This really was a more technology focused month. It was getting all the infrastructure in place to have the lab, fetch some data, and display a basic analytics dashboard. So while setting things up, we weren't that focused on reaching out to potential customers.

Setting up servers
Installing, configuring, and securing analytics software (ELK stack)
Identifying and registering domain names, setting up websites.

What are the risks and assumptions?

We also thought about what are the business risks and what assumptions are we making that could work against our success. We are not going to get into these in detail, writing them down and publishing them helps attract attention and hopefully getting the feedback we need to reduce the risk and prove or disprove the assumptions. We are also focused on what can be a product rather than what is a service.

Assumptions

Companies / Organizations will participate in a publish - subscribe business model for data sets
The data lab concept for preparing data sets for publishing will become accepted by SMB

Risks

MVP doesn't generate enough revenue or provide business value
Primary founder having knowledge, energy, or bandwidth to keep up the pace
Finding skilled employees with deep understanding of data engineering
High cost of cloud based infrastructure

Where is the revenue?

The transactional costs in the publish and subscribe (every data set transaction earns money)

this is definitely my riskiest assumption

SMB pay for services in preparing the data sets for themselves and the marketplace.

does the rise of the data engineer role show a willingness to pay for data preparation

What do we consider our Escalator Pitch?

These are early times and we don't yet have a story to tell. Gak! The escalator pitch is hard, and we really don't know what we are doing when it comes to an escalator pitch.

We help SMB realize new revenue possibilities from their existing data.
We reduce the cost of data preparation for their internal analysis and business intelligence.
We provide the services and technology to help you make sense of all your data.
We make it easy for you to see the value and opportunities based upon your unique business data.

Next Steps:

We need to focus on the customer. We need to find the customers and talk to them.
We need to reduce our risk and prove, or disprove, our assumptions.
We need a technical platform to host an Minimum Viable Product (MVP). I need to identify and prioritize a few MVP's.

If you find the Data Lab an interesting idea or have the need to bring greater value to your existing data, please feel free to contact me. We are building a business and we want to help you bring greater value from your data.

Tuesday, March 23, 2021

It's Alive! The Elastic Stack as our Data Lab

So much technical work, so little time! I finished my first three sprints toward standing up the data lab. Standing up infrastructure from scratch so you have clean new compute power is fun, and also a lot of work. Particularly when you include; doing it right, taking no short cuts, and making sure it is secure.

Sprint 0: Setup Ubuntu 20.04 Server with ELK stack.

This was mostly rehydrating virtual server infrastructure I hadn't used in 8.5 years. It needed an upgrade from all perspectives and had a completely new OS. I implemented the ELK stack and made a couple of security changes to lock it all down. I ran a few tests by setting up a couple of websites, getting the JSON confirmation from ElasticSearch, and called up the Kibana dashboard. Oooo... sweet success!

Sprint 1: Vulnerability Assessment. Security changes if required.

This evening I spent some time poking at the overall vulnerability of the server and with the ElasticSearch and Kibana services. I made a few additions and changes for further locking down the services and believe they are as secure as they can be for this first release. Very happy to feel reasonably confident about it's being locked down. Maybe, I'll get lucky and get some free PEN testing. ha.

Sprint 2: Identify and register some well aligned domain names.

I registered the following domain names, even considered buying one... it would have been too expensive. I'll implement the data lab on the oceansofdatalab.com site when it becomes closer to being a minimal viable product (MVP).

oceansofdatalab.com
oceansofdatalab.org
oceansofdatalab.net
oceansofdata.net
sevenseasofdata.com
sevenseasofdata.org

Thursday, March 18, 2021

ODE Data Lab has its technology footing

This month has become more about standing up technology than it has been talking to people about their ocean data needs. That's ok... if you are building a technology company, you need to build technology. Ocean of Data Endeavours (ODE) is about building and utilizing software towards making it super easy to work with data, large amounts of data.

The last 10 days have been about refreshing a server infrastructure I stood up 12 years ago for a number of other projects. What was left was a couple of simple websites, some domain hosting, and all the related mail server infrastructure. All of this needed a complete refresh to be brought up to date;

Rebuild the server infrastructure to have more horsepower. - DONE
Upgrade the Ubuntu OS from 10.04 (Lucid Lynx) to 20.04 (Focal Fossa). - DONE
Rework all the domain aliases to remove dependency on a domain I no longer owned. - DONE
Do some basic security work to the server. Mostly SSH focused. - DONE
Create a new mail server, and do some mailbox maintenance. - DONE
Install Apache2 httpd host. - DONE
Configure Apache2 for a couple of web sites. - DONE
Code some basic HTML to confirm the sites are working. - DONE
Celebrate! http://endeavours.com/

So good to have all this done. The server will provide a strong foundation and is well prepared for the ELK stack and the first load of ocean data. So excited!

Tuesday, March 16, 2021

An Important difference between DevOps and DataOps

Where DevOps is automation, technology, and delivery focused; DataOps is more customer focused. I like these descriptions from Wikipedia for DevOps and DataOps;

DevOps is a set of practices that combines software development (Dev) and IT operations (Ops). It aims to shorten the systems development life cycle and provide continuous delivery with high software quality. https://en.wikipedia.org/wiki/DevOps

DataOps is an automated, process-oriented methodology, used by analytic and data teams, to improve the quality and reduce the cycle time of data analytics. While DataOps began as a set of best practices, it has now matured to become a new and independent approach to data analytics. DataOps applies to the entire data lifecycle from data preparation to reporting, and recognizes the interconnected nature of the data analytics team and information technology operations. https://en.wikipedia.org/wiki/DataOps

The similarities between these two are many, particularly from a process and automation perspective. I see DevOps really focused on delivering quality software, and DataOps focused on delivering visualized data analytics to the customer.

Customer focused DataOps assists with Agility

Having a customer focused data analytics team fits well with an Agile approach. The data analytics team needs very involved customer analysts (or product owners). The customer analyst identifies the KPI's, models, or intelligences that need to be fulfilled. These become part of the backlog, and as new sprints are defined they become focused on the item(s) of analytic. A sprint can be built around a few analytics, then iterate around the items for a DataOps sprint;

Where is the data? How do we get at it?
How do we best move it? How often? What are the security or privacy issues?
What needs to be cleansed or transformed? Is the data at the correct granularity?
Do we already have any related data to improve the intelligence? Is this a new build or do we use / alter an existing pipeline?
What models or analytics do we apply?
How do we best visualize the data?

Not to say that DevOps can't fit well within Agile approaches, it can.... the backlog is more technically focused and fits into the sprint more from a continuous perspective than a customer perspective. (What DevOps features go into a sprint are often negotiated with the product owner). The focus of DataOps is in shipping features that fulfill a visualized analytic or more... The focus of DevOps is in CD / CI...

https://medium.com/data-ops/dataops-is-not-just-devops-for-data-6e03083157b7

This approach worked well for us when working on a Business Intelligence project and our nine week sprints usually focused around 4 to 9 KPI's. The organization was in aerospace, they had many legacy data sources with new data sources coming online. As with many organizations, they were in a state of improvement and transformation. Fitting new cubes, representing KPI's, into sprints allowed us to show progress and success. The biggest challenge wasn't in the technical or delivery side of getting the data to the customer. The challenge came in developing a data team where every team member understood the process end-to-end and the efforts required during each step of the DataOps pipeline. Acquiring, cleansing, and transforming data takes as much effort and understanding as visualizing the data for the customer.

Saturday, March 13, 2021

For Contract Database Administrator

Do you require contracted database administration? Medium to small organizations using database technology to store corporate data definitely need database administration to care and feed for there database technologies. This care and feeding includes, and is not limited to;

Install and maintain database servers.
Optimize database security.
Build and maintain ETL pipelines.
Performance tuning of databases.
Storage optimization for databases.
Implement DataOps for up-to-date business analytics and its need for continuous data.
Install, upgrade, and manage database applications.
Create automation, and schedule, repeating database tasks.
Ensure recoverability of database systems.

If you require any, or all, of these database administration task we can help. With over 30 years of database experience complimented with a technology degree in database management we can work remotely to keep your databases healthy and reduce business risk. Part-time or full-time, reasonable rates.