One of the things I aim to highlight in this post is how to carefully examine and understand a problem before applying cutting-edge technology to solve it. We will first look at a problem from the end user’s perspective, establish and understand the user story, then propose the technology to solve it.
As a busy consultant who visits lots of clients, there’s a lot of information that I come across which I want to note. However, I don’t have the time to write (or type) it all down. Typically I make paper notes, which I end up misplacing from time to time. Additionally, in my pile of paper notes, I often find it difficult to find the information I need.
Reading the consultant’s story, it’s clear that typing on a keyboard is just not preferred. That’s a fair requirement, and it gives us a problem to solve. One of the first technologies that comes to mind is natural language processing, specifically speech recognition, where the consultant’s spoken words are transcribed to text. Let’s think it through. To make quick notes without typing, one could just record voice memos. However, being able to search through the contents of the notes is also a requirement. Sure, we can add a timestamp to every memo and retrieve the notes for a given day or time period, but we won’t be able to search through what was actually said. Transcribing what the consultant says to text solves the “don’t want to type” problem, and having that information in plain text also makes it searchable. Remember, the difficulty of finding information in a pile of paper notes is also part of the problem.
Building a natural language processing based solution is a huge effort, so in this post we will aim to establish a base framework for solutions using this technology, i.e. build the core foundation.
P.S. The consultant’s problem is just one example. Think about it: we can also use the same framework to build an app for non-native English speakers who want to get better at speaking English. Hence it’s a good idea to lay the framework on which future products can be built.
A mobile app which, when launched, starts transcribing (converting audio to text) whatever the user says.
After the user is done, the app should present an opportunity to edit the transcribed contents.
Persist the user’s logs, i.e. save them on disk, in the cloud etc., along with the time each one was created.
The user should be able to browse through a list of all the previous logs in chronological order, i.e. sorted by date created.
One of the things we want to do is provide users with “continuity”, the ability to access their logs from anywhere, so establish a means to accomplish this. Think about it: the consultant may forget their mobile at a client site and may need the information off-site.
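To make the browsing and searching requirements a little more concrete, here is a minimal Swift sketch. It assumes each log is nothing more than its transcribed text plus a creation date. The `LogEntry` type and the two helper functions are hypothetical names used for illustration only; they are not part of any iOS framework.

```swift
import Foundation

// Hypothetical model for one transcribed note (illustrative names).
struct LogEntry {
    let text: String
    let createdAt: Date
}

// Chronological order: oldest log first.
// Flip the comparison to `>` to show the newest log first instead.
func sortedByDateCreated(_ logs: [LogEntry]) -> [LogEntry] {
    return logs.sorted { $0.createdAt < $1.createdAt }
}

// Case-insensitive keyword search over the transcribed text.
func search(_ logs: [LogEntry], for keyword: String) -> [LogEntry] {
    return logs.filter { $0.text.range(of: keyword, options: .caseInsensitive) != nil }
}

// Example data, deliberately stored out of order.
let logs = [
    LogEntry(text: "Follow up with client B about pricing",
             createdAt: Date(timeIntervalSince1970: 2_000)),
    LogEntry(text: "Client A wants a demo in March",
             createdAt: Date(timeIntervalSince1970: 1_000)),
]

let oldestFirst = sortedByDateCreated(logs)
let pricingNotes = search(logs, for: "PRICING")
```

Because the notes are plain text, the search is a simple filter. This is exactly what a pile of voice memos or paper notes cannot offer.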
Technical Requirements (iOS)
For the home screen use a few labels, buttons and a text area to show the user’s speech as it’s transcribed to text.
For transcribing, spend some time researching the natural language processing (NLP) techniques available for iOS, e.g. frameworks such as Core ML or SiriKit.
Text editing shortcuts: a few buttons to delete the last word or line from the log.
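The logic behind those shortcut buttons can be kept separate from the UI. Below is a rough sketch of two pure functions that a “delete last word” or “delete last line” button could call. The function names are my own, and the behaviour is an assumption rather than a spec: trailing whitespace is ignored, and deleting from a single-word or single-line log empties it.

```swift
// Removes the last word from the text, ignoring any trailing whitespace.
func deletingLastWord(from text: String) -> String {
    var trimmed = text
    // Drop trailing spaces/newlines so "hello world " still removes "world".
    while let last = trimmed.last, last.isWhitespace {
        trimmed.removeLast()
    }
    // No whitespace left means there was only one word.
    guard let cut = trimmed.lastIndex(where: { $0.isWhitespace }) else {
        return ""
    }
    return String(trimmed[..<cut])
}

// Removes the last line from the text, ignoring trailing newlines.
func deletingLastLine(from text: String) -> String {
    var trimmed = text
    while let last = trimmed.last, last.isNewline {
        trimmed.removeLast()
    }
    // No newline left means there was only one line.
    guard let cut = trimmed.lastIndex(where: { $0.isNewline }) else {
        return ""
    }
    return String(trimmed[..<cut])
}
```

In the app, a button handler would do something along the lines of `textView.text = deletingLastWord(from: textView.text)`, where `textView` stands for whatever text area shows the transcription.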
Store the user’s logs in UserDefaults for local storage.
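One possible way to do this is to make the log type `Codable`, encode the whole list as JSON and store the resulting `Data` under a single key. The sketch below assumes exactly that. `LogEntry`, `LogStore` and the `"savedLogs"` key are hypothetical names picked for this example. Keep in mind that UserDefaults is meant for small amounts of data, so this suits a first version more than a large archive of notes.

```swift
import Foundation

// Hypothetical model for one transcribed note (illustrative names).
struct LogEntry: Codable, Equatable {
    let text: String
    let createdAt: Date
}

// Small wrapper that keeps all logs under one UserDefaults key.
struct LogStore {
    private let defaults: UserDefaults
    private let key = "savedLogs" // assumed key name

    init(defaults: UserDefaults = .standard) {
        self.defaults = defaults
    }

    // Returns the saved logs, or an empty list if nothing is stored
    // or the stored data cannot be decoded.
    func load() -> [LogEntry] {
        guard let data = defaults.data(forKey: key),
              let logs = try? JSONDecoder().decode([LogEntry].self, from: data) else {
            return []
        }
        return logs
    }

    // Replaces the stored list with the given logs.
    func save(_ logs: [LogEntry]) {
        guard let data = try? JSONEncoder().encode(logs) else { return }
        defaults.set(data, forKey: key)
    }

    // Adds one log to the end of the stored list.
    func append(_ entry: LogEntry) {
        save(load() + [entry])
    }
}
```

Saving a freshly transcribed note would then be `LogStore().append(LogEntry(text: transcript, createdAt: Date()))`, with `transcript` being the edited text from the home screen.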
Show the list of past logs in a UITableView with each cell containing a UITextView, and make sure that the UITextView in each cell can be edited. This should be possible! If not, then discuss.
Use the Firebase authentication and database APIs to store user logs in the cloud.
Release requirements (iOS App Store)
A 1024×1024 icon to display for the App Store release.
An App Store description for the app.
A selection of keywords for ASO (App Store Optimization).
A set of screenshots, i.e. 4 or 5.
A video that shows how the app works.
This is another example of how we try to solve problems with technology for potential users at My Day To-Do. The message here is: when presented with a problem, carefully consider it and evaluate whether or not some bleeding-edge technology best solves it. Think of the value it will provide to the user. There’s no point in trying to force some technology onto a problem.
That sums up the problem. The app I am building to solve it is almost finished, and I hope to release it sometime next week.
As usual, if you find any of my posts useful, support us by buying or even trying one of our products and leaving us a review on the App Store.