Building an AI SaaS product on a shoestring budget with AWS serverless (Part 1)


21 Jan

1. Introduction

SaaS products are everywhere: Salesforce, Google, HubSpot, LinkedIn and just about every other social media platform all use the software as a service model. But is it possible to create a SaaS product on a shoestring budget, without VC funding and unlimited resources? Well, I’m here to tell you that you can, and I know because I’ve done it. The SaaS product that we created is enterprise grade and sold to large businesses, and guess what: it’s still live today and generating revenue. In this post I’m going to show you how we did it.

So what are we going to be covering today? Well, we’ll start off with a brief: we need to know what we’re going to create and what constraints we’re working with. Then we’ll go into the top 10 things that we need to take into consideration when we’re designing our solution: some will be specific to SaaS, and others will be applicable to software in general. Finally, we’ll look at the server and serverless infrastructure models, and examine exactly what they mean, where we get them from and how they affect our development process. To do this you’re going to step into my shoes and I’m going to show you what happened.

2. The Brief

So congratulations, you are now CTO and architect, with the unenviable task of delivering the product that’s going to seal the fate of the business. The objective is to build a service that will be purchased by large businesses, where each business may have multiple users all using the application. Those users need to be able to log in securely, and to do it according to their own company’s security standards. The application itself should allow users to upload a dataset, configure a deep learning model, train that model and then serve predictions either through an API or in a batch process. It needs to work quickly and reliably from any location in the world, onboarding new customers should be seamless, and it should scale to any number of customers and users.

Now there are a few constraints that we need to work with. Firstly, we don’t have a lot of money, so everything that we do needs to be done on a budget: it needs to be cheap to develop and cheap to run. And there’s another problem. All of the tech stack that we have worked with in the past has been built on salesforce.com, and that’s what our engineers are trained on. Unfortunately for us, salesforce.com is not great for building machine learning platforms on, so we’re going to need to find an alternative.

All of a sudden any excitement that you had has been replaced with acute anxiety and a need to breathe deeply into a paper bag, but fear not, because hope is on the horizon.

3. Top 10 Considerations When Building a SaaS Product

You decide that the first thing we really need to do is make a list of all the things that we need to take into consideration as we’re designing and building our product. This will guide not only our design choices but also which third-party services we are going to use, and that’s going to be key to maximising productivity and getting this thing to market with the limited resources we have. Some of these things will be common to any software development project and some will be specific to a SaaS product like the one we’re building today. Here’s the top 10 that you come up with.

Tenancy Isolation
You imagine a scenario where you have two customers both selling robotics, “Botzilla PLC” and “RoboNope Ltd”, and we inadvertently make RoboNope Ltd’s data accessible to Botzilla PLC in our app. Because you’re smart, you know it’s generally frowned upon to leak highly sensitive company data to its competitors, so you start thinking about how you’re going to keep your customers’ data separated.
GDPR / Regulatory compliance
Your product is going to be collecting and storing customer data, and you know that comes with some additional responsibilities for things like GDPR. In fact, after digging around with your CFO, you come to the conclusion that this will make your business a “Data Controller” in GDPR speak. We’d better get the lowdown on what this means, as the word litigation for some reason makes you somewhat nervous.
Performance
However robust you think your application is, your customers will always find a way to break it, usually within the first day. You imagine that you’ve onboarded a new customer and then you get a call. Something doesn’t work. You check the logs and you see something like an out of memory error. If you’ve designed your architecture so that components intended for quick, lightweight tasks end up tackling long-running, memory-intensive ones, then you’re probably going to have a problem, and quite rightly you think that things like that are going to be difficult to sort out quickly. So we’d better choose our services wisely, based on what they’re designed to do.
Monitoring and Observability
You know that bugs are a fact of life. If we’re smart, however, we’ll find a way to be made aware of technical issues, ideally before the customer is, along with sufficient information to be able to correctly diagnose the problem. Secondly, and maybe less obviously, you realise that getting some insight into how your customers are using your product would be really helpful input for guiding your future roadmap. We’d like to know things like how frequently they are using it and which features are used the most. Let’s give some thought to how we might do that.
Service Reliability
Back in Salesforce world, things like Disaster Recovery, Business Continuity and Incident Management aren’t really a thing, as Salesforce vendor products don’t actually store any data, and if the service goes down then, well, that sounds like a Salesforce problem, not a you problem. Now you think maybe we can get away with doing the same thing with whatever platform we build on, but you suspect not, and come to the horrifying conclusion that you’re going to have to get this in order. You think: what’s going to happen if the service fails? Who’s going to notify the customers and by what method? Who’s going to work on the problem? How is our team going to be organised to best tackle it and get to the root cause? What is our immediate strategy to contain the issue? This is all stuff that we need to work out in advance, because the last thing we want to be doing is figuring this out while we are actually facing an incident.
Security
You know that your SaaS production system is going to be a precious thing, because you have all your revenue generating capability concentrated in one place. Giving access to a junior dev, or even worse that tech-savvy sales rep you know, is going to be like giving the keys of a Ferrari to a 16-year-old kid and telling him to go nuts. The results are probably going to be spectacular, messy and end up costing a lot of money. Production systems are things we don’t want to be messing around with manually. The thought of configuring a bunch of database tables with hand-written scripts whilst the system is in live use breaks you out into a cold sweat. So we need to figure out how we can lock down access as tightly as possible, to protect it both from our own stupidity and from anyone outside who might fancy trying to hack into our system.
Updates and new features
Inevitably, you build your product to the point where it’s generally available, and then you show it to a customer or a prospect and the first thing that they say is, “That’s great, but if only it did this thing and that thing then we would be delighted”. After several weeks of sleep deprivation getting the product to where you could show it to the prospect on this day, you’re probably going to have to suppress the overwhelming urge to resort to violence and instead smile and say “that’s really great feedback”. But the truth is that we’re engineers, we’re always looking ahead at what’s coming next, and so we need to find a way to make frequent incremental changes and deploy them into the production environment in a safe and controllable way. Quite rightly you figure that’s going to be some sort of CI/CD process.
Tenant provisioning and de-provisioning
You imagine that your sales and marketing teams have gone to a conference and, knowing what they’re like, they offer a deal: there’s a 30 day free trial for our new product. There’s a deluge of people who sign up and all of a sudden you’ve got 50 prospects who all want access to your system. What’s it going to be like for us to provision all those prospects if it isn’t automated? Imagine how hard it would be to be scrabbling around inside a production system with a piece of paper, trying to figure out for each customer how to create a new database table, how to set up authentication and user access, and so on. It’s an impossible situation. We don’t want to be involved in this game. We’re the product team, and we want to let the Marketing team, the Sales team, the Operations team or whoever it is be responsible for onboarding new customers. We want to do that by giving them a single button to click, and then it’s done because it’s automated.
Designing common features
If you’ve come from a consulting background or have worked in internal IT functions previously, then you will be familiar with making applications that are specific to one customer’s needs. Making a SaaS product that you are going to sell to many customers is different, because what you’re doing is creating software from a common codebase that is going to be shared by all of your customers. You know that not every customer is going to be completely happy with your product in its vanilla state, but maybe there is a compromise to be made. Maybe there are things that we can do that will allow a customer a certain level of customisation, driven by their data and metadata. It might mean that we build in the ability for customers to view custom fields or to define a workflow. Whatever it is, we should really figure it out, because having that mix of common codebase with metadata-driven customisation is really going to give you the flexibility to scale and develop your product going forwards.
The credibility of the platform that you’re building on
You realise that not everything should be governed purely by budget. There are times when you need to choose things on quality. If you’ve gone out and built your product on top of some new startup’s infrastructure service, run by a kid in his bedroom, to save on costs, then that sounds like a recipe for trouble. Imagine that your CRO comes to you and says: “Hey, I’ve got this great prospect and he wants to have a security review with you before he goes ahead and signs a contract”. So you go to this meeting with their security officer, whose responsibility it is to ensure that his company’s data is well looked after. Now, what’s it going to be like to tell him that his data is actually going to be living inside a shed on a 15-year-old machine? He’s probably going to say that your product is a sh*tbox and bid you good day. That sort of thing tends to reflect poorly on your decision making abilities. So let’s not be that guy.
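
To make the tenancy isolation and tenant provisioning points a little more concrete, here is a minimal sketch in Python. It is a toy in-memory stand-in, not how our product stores data: the `TenantRegistry` class, its method names and the tenant IDs are all hypothetical. The only idea it illustrates is that every read and write is scoped by a tenant ID, and that onboarding or offboarding a tenant is a single automated call.

```python
from dataclasses import dataclass, field


class UnknownTenantError(KeyError):
    """Raised when a request names a tenant that has not been provisioned."""


@dataclass
class TenantRegistry:
    """Toy in-memory stand-in for a multi-tenant data store (hypothetical)."""

    # tenant_id -> {record_key: record}; each tenant gets its own namespace.
    _data: dict = field(default_factory=dict)

    def provision(self, tenant_id: str) -> None:
        """The 'single button' for onboarding: create an empty namespace."""
        self._data.setdefault(tenant_id, {})

    def deprovision(self, tenant_id: str) -> None:
        """Offboarding: drop the tenant's namespace and everything in it."""
        self._data.pop(tenant_id, None)

    def _namespace(self, tenant_id: str) -> dict:
        if tenant_id not in self._data:
            raise UnknownTenantError(tenant_id)
        return self._data[tenant_id]

    def put(self, tenant_id: str, key: str, record: dict) -> None:
        self._namespace(tenant_id)[key] = record

    def get(self, tenant_id: str, key: str):
        # Lookups are always scoped to the caller's tenant, so the same key
        # stored under another tenant is simply invisible.
        return self._namespace(tenant_id).get(key)


registry = TenantRegistry()
for tenant in ("botzilla", "robonope"):
    registry.provision(tenant)

registry.put("robonope", "dataset-1", {"rows": 1200})
print(registry.get("robonope", "dataset-1"))  # the owner can read its record
print(registry.get("botzilla", "dataset-1"))  # the competitor cannot
```

In a real build the same idea might map to a tenant ID baked into every storage key and every access policy. That mapping is an assumption for illustration, not a description of our architecture.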

4. Serverless to the Rescue

If you’ve made it this far and I haven’t completely put you off the idea of building a SaaS product, firstly well done, and secondly let’s move on and start exploring what options we have to actually build this thing on. Now, coming from a Salesforce background, the benefits of working in a serverless way have been drilled into you, but what exactly are we talking about? In Salesforce, development is done at an abstracted level, meaning that you have APIs to work with that you can plug into your IDE. There is a config user interface that allows you to create new database tables (called objects), fields and even workflows without writing any code. Your whole experience is with the interface that Salesforce provides, and Salesforce in turn takes responsibility for all the infrastructure and plumbing needed to make the service run. You don’t get to control what compute resources you get or what database system you run on, as that’s all done by Salesforce. Now the drawback is that Salesforce will enforce strict controls over what you can do, like how many records you can query in one go, but on the plus side development can be very rapid and you don’t need to hire a DevOps team. It’s a trade-off, and the reason I mention this is that I see Salesforce as the purest version of serverless that you can get.

At the other end of the spectrum, you’re in the realm of the DIY datacenter, with a 15-year-old desktop sat in the corner of the room and a bunch of RJ45 cables everywhere. When the new intern unplugs the router to charge their phone, the whole service goes offline and you’ve got customers screaming down the phone at you like it’s Armageddon. At which point the network guy wakes up from his afternoon nap and starts crawling under desks looking for broken cables. It’s a world of pain and I know where I’d rather be.

Enter AWS. Now I know that there are other options like GCP and Azure, but let’s just assume that if we’re going down the IaaS route then we’ve selected AWS. What they do is provide a set of services that fill that spectrum, and today they have more than 200 of them. Things that are closer to the DIY datacenter are on the server side, an example being EC2, which essentially lets you provision your own virtual servers and where you are responsible for everything that goes with that, like the OS, security patches, access rules, etc. At the other end, the serverless end, you have things like Lambda, which are small units that run your code logic. There’s no OS, patching or networking for you to manage, and a function can even be configured to have its own API endpoint.

You’re thinking that our strategy should be to prioritise selecting services that are on the serverless side over things that are on the server side. That should give your team the best possible chance of maximising development productivity with the few precious resources that we have available.

So what can you do with serverless? As it turns out, quite a lot. Need to write a function to execute some logic? Great, use AWS Lambda. Need to store some data? DynamoDB to the rescue. What’s that, you need an API? No problem, API Gateway can handle that. You get the picture: for 80% of the things you need there is a serverless option.
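
To give a feel for how small the "write a function" step can be, here is a minimal sketch of a Python Lambda handler of the kind you might put behind API Gateway. The `(event, context)` signature and the `statusCode` / `headers` / `body` response shape follow the Lambda proxy integration convention. Everything else is an assumption for illustration: the `values` field, the averaging "prediction" and the `_response` helper are hypothetical placeholders, not our production code.

```python
import json


def _response(status_code, body):
    # Proxy integrations expect statusCode, headers and a string body.
    return {
        "statusCode": status_code,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(body),
    }


def handler(event, context):
    """Entry point in the (event, context) shape that Lambda calls."""
    # With a proxy integration the request body arrives as a JSON string
    # (or is missing), so parse it defensively.
    try:
        payload = json.loads(event.get("body") or "{}")
    except json.JSONDecodeError:
        return _response(400, {"error": "body must be valid JSON"})

    values = payload.get("values") if isinstance(payload, dict) else None
    if (
        not isinstance(values, list)
        or not values
        or not all(isinstance(v, (int, float)) for v in values)
    ):
        return _response(400, {"error": "'values' must be a non-empty list of numbers"})

    # Placeholder logic: a real service would call the trained model here.
    prediction = sum(values) / len(values)
    return _response(200, {"prediction": prediction})


if __name__ == "__main__":
    # Local smoke test with a hand-built event; no AWS account needed.
    fake_event = {"body": json.dumps({"values": [1, 2, 3]})}
    print(handler(fake_event, None))
```

Because the handler is just a function that takes a dictionary and returns a dictionary, it can be exercised locally like this before any infrastructure exists.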

The drawback is that you pay a premium for serverless services, but for the most part they are billed by how much you use rather than at a flat fee. For you, building a brand new service, that makes sense: when you have very few customers your costs will be low, and they should rise reasonably predictably as your customer base grows. As long as you’re fairly sensible in how you use those resources, it really doesn’t have to cost a lot to run (seriously, in our early days we were running multiple stacks for development, testing and production for a few hundred pounds a month).
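
The pay-per-use argument can be sketched as a bit of arithmetic. The prices below are made-up round numbers chosen purely for illustration; they are not real AWS rates, and the function names are hypothetical. The point is only the shape of the comparison: usage-based cost scales with traffic, so it starts near zero and crosses a flat fee at some break-even volume.

```python
def usage_based_cost(requests: int, price_per_million: float) -> float:
    """Pay-per-use bill: cost scales linearly with the number of requests."""
    return requests / 1_000_000 * price_per_million


def break_even_requests(flat_fee: float, price_per_million: float) -> float:
    """Monthly request volume at which pay-per-use equals the flat fee."""
    return flat_fee / price_per_million * 1_000_000


# Made-up illustrative prices, NOT real AWS rates.
PRICE_PER_MILLION = 5.0  # cost per million requests (hypothetical)
FLAT_FEE = 100.0         # monthly cost of an always-on server (hypothetical)

for monthly_requests in (10_000, 1_000_000, 50_000_000):
    cost = usage_based_cost(monthly_requests, PRICE_PER_MILLION)
    cheaper = "pay-per-use" if cost < FLAT_FEE else "flat fee"
    print(f"{monthly_requests:>11,} requests: {cost:8.2f} vs {FLAT_FEE:.2f} flat -> {cheaper}")

print(f"break-even at {break_even_requests(FLAT_FEE, PRICE_PER_MILLION):,.0f} requests/month")
```

With a handful of early customers you sit well to the left of the break-even point, which is why usage-based billing suits a brand new service.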

Figure: The server to serverless spectrum. This is not a complete list; it’s just for illustrative purposes.

You now have a strategy for how to choose the various services that will make up your SaaS product, but how are you actually going to create this thing? Well, that will be the subject of our next post, so look out for Part 2.

5. How We Can Help

If you’re facing any of these challenges then we can help. Please contact us for an initial consultation.



Gareth Davies

Gareth is an AI researcher and technology consultant specialising in time series analysis, forecasting and deep learning for commercial applications.

https://www.neuralaspect.com