100 most promising AI startups

We’ve come a long way since forming in 2015. Starting out as a small team, we now have four offices worldwide – Lisbon, Porto, Tokyo, and Seattle – and continue to grow every day.  

Our unique platform has helped many successful companies feed their artificial intelligence applications with training data. Using human intelligence coupled with machine learning, we deliver project-specific, quality-guaranteed data.

Today, we’re proud to announce that DefinedCrowd has been named to CB Insights’ third annual list of 100 AI startups. A research team from CB Insights selected the 100 startups based on the following factors: investor profile, market potential, partnerships, competitive landscape, and team strength.

Source: CB Insights

Companies are categorized by focus area. These focus areas aren’t mutually exclusive and include core sectors such as telecommunications, government, retail, and healthcare, as well as enterprise tech sectors such as training data (where we sit), software development, data management, and cybersecurity.

We are pleased to be among this group of incredible AI startups, selected from an extensive list of 3k+ AI companies, and look forward to seeing these companies grow.  

It’s been a great start to 2019, and we’re very thankful to everyone who has helped get us here.

Daniela Braga’s journey with DefinedCrowd

Earlier this week, DefinedCrowd was featured in Jornal Económico, a premium financial publication in Portugal. We’ve translated the article from the original Portuguese for our English-speaking friends. Enjoy!

Original Article by
António Sarmento

Founded in Seattle, USA, DefinedCrowd is a startup specializing in training data for Artificial Intelligence. The company counts Amazon, IBM, and EDP as investors and clients. 

DefinedCrowd provides services so that data scientists can gather, structure, and enrich datasets for Artificial Intelligence, helping companies improve speed to market and the overall quality of their AI products. DefinedCrowd accelerates enterprise AI initiatives by combining machine learning technology with human-in-the-loop collection processes. Founded in August 2015 by entrepreneur Daniela Braga, the company is headquartered in Seattle, has R&D centers in Lisbon and Porto, and a sales office in Tokyo. 

Three months after its founding, the company opened its first R&D office at Startup Lisbon. Since then, DefinedCrowd has blossomed from an initial team of three employees to a workforce of more than 70 that is still growing.

In 2016, the company raised $1.1 million in seed funding, with investors such as Sony, Amazon Alexa Fund, Portugal Ventures, and Busy Angels.
In July 2018, DefinedCrowd closed a Series A funding round worth $11.8 million, led by Evolution Equity Partners. EDP Ventures, Mastercard and Kibo Ventures joined as new investors, while Sony, Amazon, Portugal Ventures and Busy Angels bolstered their investments with additional capital for the data company.

“It is important to raise capital if we want to move fast, especially in the technological sector.”   

Daniela Braga to Jornal Económico

This influx of capital is being used to accelerate product development and team growth. Two-thirds of DefinedCrowd’s 70 employees work out of Portugal. The company expects to add 80 more team members by the end of 2019.

Over the past six months, DefinedCrowd has announced three partnerships: a formal designation as an Amazon Alexa Skills partner, a product integration with IBM Watson Studio, and participation as a featured vendor in Microsoft’s co-sell program.

DefinedCrowd’s platform provides industry-agnostic data services and can support text, audio, and image annotation. The company’s clients span industries as a result: from Fintech, to Retail, Healthcare, Utilities, and the Internet of Things. Their client portfolio consists mostly of Fortune 500 companies, including BMW, Mastercard, EDP, José de Mello Saúde, SoftBank, Yahoo Japan, Randstad, and Nuance.

DefinedCrowd’s goals are ambitious. The company aims to become the world’s number one AI data provider through expanding their client-base and forging new partnerships with industry leaders. 

With a degree in Portuguese Language and Literature, Daniela Braga has spent her career examining the rigorous use of language, the perfect foundation for her business. “We deal daily with data in 70 languages and dialects. Our clients need, at a minimum, native-level speakers and sometimes even require linguists or specialists in language sciences for all of them,” says the entrepreneur.

After graduating with a master’s degree in applied linguistics, she went on to earn a PhD in Speech Technologies at the Faculty of Engineering at the University of Porto and taught at the University of A Coruña for two years before joining Microsoft (where she worked in Portugal, China, and the United States).

After leaving Microsoft in 2013, Daniela moved to American company Voicebox. Simultaneously, she was invited to teach Data and Crowdsourcing for Speech Technologies at the University of Washington. It was during this time that she saw the gap between the Artificial Intelligence data scientists wanted to develop and the training data available to build it. She decided to found her own company as a result.

Waving a well-paid job goodbye, and with few personal resources, she started meeting with investors in Seattle, and quickly received an initial check: $200,000 in financing to start her business. A business that is now signing contracts with some of the largest companies in the world.

DefinedCrowd is in constant growth and employee numbers have been updated to reflect our current position.

The Machine Learning Lover’s Holiday Book List

In the market for some last-minute gift recommendations for a machine learning “geek” (we use the term affectionately around here)? DefinedCrowd’s got you covered with our “machine learning lover’s book list,” hand-selected by our ML Team. From the ins and outs of speech and language processing to broad-level theoretical overviews of the machine learning field, these texts cover the wide-ranging topics we discuss in our office every day. Enjoy! And happy holidays from all of us at DefinedCrowd.

Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition by Daniel Jurafsky and James H. Martin

Summary: In this excellent intro to speech and language processing technologies, Jurafsky and Martin present an empirical approach that comprehensively covers language technology based on applying statistical and machine-learning algorithms alongside modern technologies. The book largely emphasizes scientific evaluation and practical applications.

Foundations of Statistical Natural Language Processing by Christopher D. Manning and Hinrich Schütze

Summary: A fantastic introduction to statistical natural language processing that uses an analytical approach to cover a range of mathematical and linguistic foundations for NLP technologies. Foundations of Statistical Natural Language Processing is also useful for its presentation of the theoretical and algorithmic building blocks of NLP technologies.

Crowdsourcing for Speech Processing: Applications to Data Collection, Transcription and Assessment by Maxine Eskenazi, Gina-Anne Levow, Helen Meng, Gabriel Parent and David Suendermann

Summary: An essential read for anyone interested in learning more about crowdsourcing training data for speech models. Offers a comprehensive overview from the basics of setting up a task, to tips for task interfaces and methodologies for quality assessment.

Deep Learning (Adaptive Computation and Machine Learning series) by Ian Goodfellow, Yoshua Bengio and Aaron Courville

Summary: This book offers a great introduction to what many consider the “Holy Grail” of Machine Learning. Topics covered range from mathematical and conceptual background to deep learning techniques. The “research perspectives” that book-end chapters with specific case-studies make Deep Learning a great resource for students and software engineers alike.

Deep Learning for Computer Vision with Python by Adrian Rosebrock

Summary: For those looking to master deep learning for image recognition and classification, Deep Learning for Computer Vision with Python offers practical walk-throughs, hands-on tutorials and a direct teaching style. Useful both for beginners and for the seasoned deep learning pro looking to brush up on the fundamentals.

Artificial Intelligence and the Future of Jobs

The growth and development of computer programs supported by artificial intelligence has led to intense debate, both around regulatory difficulties and around the technology’s potential effects on employment. Are people’s concerns in these areas warranted?

From the earliest days of civilization, humans, as the only thinking beings on earth, have sought to reduce the need for physical work by inventing tools. First came the wheel, and with it the transportation of food over greater distances with less manpower. We have learned to evolve by creating more with less effort. For many, this was negative in the short term: those who were once freight carriers lost their jobs. The wheel was invented around 3000 BC. You might think I’m crazy to start a technological discussion at this historical moment, but the reference helps demystify the discussion before we analyze the data we have.

Moving forward several thousand years to the start of the industrial revolution, millions of people protested in the streets of England and the United States against the introduction of weaving machinery. On the surface, the “destruction” of jobs seemed quite high. However, in truth, these jobs were never really destroyed, but rather professionally reformed. Factories, with a drastic boost in production, were increasing the salaries of those who adapted to the machinery while simultaneously reducing their overhead costs. Both countries’ wealth grew as a result, thanks to increases in disposable income for families and more jobs created to support the burgeoning weaving and spinning industry. Indeed, the number of people employed in weaving jumped from 7,900 to over 320,000 after the invention of the weaving machine.

Now, after a little history review, let’s return to the present.

Recently PwC, one of the world’s largest consultancies, launched a global study in which it estimated that artificial intelligence will replace 20% of today’s jobs in the UK within the next 20 years. However, it also estimated that artificial intelligence will create just as many jobs as it replaces. Those at high risk include the law, finance, and insurance sectors, as well as drivers and white-collar workers. Areas like education, science, information, communication and computing are among those that will be most valued in the future.

Nowadays, from the moment we wake up and look at our mobile phones, until the moment we lie down and check our Facebook feed for the last time, we’re in constant contact with artificial intelligence that gives us the kind of information that allows us to make better decisions. We need to accelerate the transformation of educational systems, adapting them to the new realities of the fourth technological revolution with a particular focus on programming disciplines. We also need to find ways to support professional training programs that respond to the demands of the labor market.

Ultimately, there is no future in which machines will be able to replace what binds human beings: creativity, intuition and love. At the end of the day, perhaps AI will make us even more human.

On Smarter AI for a Better World

Before founding DefinedCrowd, our CEO Daniela Braga had a long career as a researcher at companies like Voicebox and Microsoft. A pioneer in speech technology, she became one of the earliest advocates for voice-enabled technology as a primary user interface (long before Alexa, Siri, and Cortana proved her right).

Her stance was rooted in a passion for uncovering the crossroads where technology and human experience collide. Convinced that Automated Speech Recognition (ASR) could improve all of our interactions with the world at large, she strove to build the models that would make that vision into a reality.

Quickly, she butted up against technological limits. The lack of high-quality training data so critical to constructing effective models was chief among them. She founded DefinedCrowd in 2015 as a direct result of that frustration, envisioning a company that would leverage cutting-edge technology, dynamic workflows, and innovative crowd-management practices to deliver the exact data sets researchers would need to build high-performance models.

Her passion for AI as a means of enhancing human experience permeates everything we do. We take the goal seriously. As our COO, Walter Benadof, wrote so eloquently just a few months ago, it’s imperative that every practitioner in this field maintains a core set of ethics and values as they continue to develop and mature.

That word, “practitioner,” is no accident. I use it to reinforce a concept we’ve touched on before, a “Hippocratic Oath” for AI, first proposed in Microsoft’s The Future Computed and further elaborated upon by Oren Etzioni at TechCrunch. We’ve been thinking hard about how our past and future work fits with the values stated therein and the values of our company as a whole.

We’re proud that our core competencies in Natural Language Processing, Computer Vision and Automated Speech Recognition are already making workplaces, classrooms, and ultimately the world at large more accessible, safer, and easier to navigate.

On top of that, we’re also thrilled to be in the process of developing several inspiring pilots for use cases as diverse as improving healthcare interfaces and detecting preventable natural disasters (think wildfires) before they have a chance to spread out of control.

The AI sector as a whole is just scratching the surface of how the technology we create can improve the human experience. We look forward to continuing our partnerships, and forging new ones, with companies we truly consider beacons of our industry. We can’t wait to work side by side to unlock new use cases and technologies that truly can make our world a better place.

That’s why we do what we do. It has been from the very beginning.

10 Tips For Building a Successful Chatbot

“Building a bot is easy. Building a bad bot is even easier.” - Norm Judah (CTO, Microsoft)

Intro:

Globally, businesses spend $1.3 trillion on 265 billion customer service calls every year. As a result, brands across industries are investing in chatbots as a way to save time (99% improvement in response times) and money (30% average drop in cost-per-query resolution) while increasing customer satisfaction.

But that holy trifecta only comes to fruition if the bot gets things right every single time. Without precision training data, models trip up on simple tasks, consumers get frustrated, and the whole thing falls apart.

While an average company may look at chatbots simply as a means of cutting costs, industry leaders understand that AI opens the door for entirely new and innovative products. Take banking customers, for example, who identified their top priorities in a study by CGI Group as follows:

  • To be rewarded for their business
  • To be treated like a person
  • To be able to check their balance anytime they wish
  • To be provided with wealth-building advice
  • To be shown spending habits and given advice on how to save

Forward-thinking banks know that by investing in a chatbot today, they’re laying the groundwork for a technology that, down the line, will allow them to hit every single one of those customer priorities. They’re investing accordingly and, according to the McKinsey Global Institute, they’re building an insurmountable advantage as a result.

With that in mind, here are my top 10 tips for keeping a chatbot initiative on the road to long-term success:

1. Know The Story

Intents are the fundamental building blocks of task-oriented chatbots. Think of them as the problems that your agent will need to be able to resolve. In a banking scenario, these could be anything from checking an account’s balance, to wiring money, or checking branch hours. You need to understand your customers’ needs and map them out into well-defined actions (intents). Make flowcharts that delineate every possible flow of a conversation from point A to point B. Understand how the customers’ intents are interlinked, and determine whether there is a logical order between them. If you don’t do this exhaustively, your bot will be thrown by even the slightest variations.
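
As a rough sketch of that flowchart exercise, a conversation map can be written down as a small graph and every route from point A to point B enumerated. The intent names and the links between them below are made up for illustration; a real map would come from your own customer research.

```python
# Hypothetical conversation map for a banking bot: each intent lists the
# intents a customer might move to next. Names and links are illustrative only.
FLOW = {
    "greet": ["check_balance", "wire_money", "branch_hours"],
    "check_balance": ["wire_money", "goodbye"],
    "wire_money": ["check_balance", "goodbye"],
    "branch_hours": ["goodbye"],
    "goodbye": [],
}


def all_paths(flow, node, goal, seen=()):
    """Yield every path from node to goal that never revisits an intent."""
    seen = seen + (node,)
    if node == goal:
        yield list(seen)
        return
    for nxt in flow[node]:
        if nxt not in seen:
            yield from all_paths(flow, nxt, goal, seen)


paths = list(all_paths(FLOW, "greet", "goodbye"))
for path in paths:
    print(" -> ".join(path))
```

Listing the paths this way makes it easier to spot a route nobody has designed a response for.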

2. Get Your Entities Straight

If intents define the broad-level context that determines a chatbot’s capabilities, entities are the specific bits of information the bot will need in order to execute those actions. That means when a bot recognizes an intent, like wiring money let’s say, it also needs to know the recipient and monetary amount to be transferred (at the very least). Intents can be as complex as needed, containing both mandatory and optional entities (like source account or currency, in the money wiring scenario).
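
One way to picture this is a small schema per intent that separates mandatory from optional entities. The sketch below is a minimal illustration, not any particular framework’s API; the class name `IntentSchema` and the entity names are assumptions based on the money-wiring example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntentSchema:
    """Hypothetical description of an intent and the entities it needs."""
    name: str
    required: frozenset
    optional: frozenset = frozenset()

    def missing(self, entities):
        """Return the mandatory entity names not yet supplied, in sorted order."""
        return sorted(self.required - set(entities))


# Entity names follow the money-wiring scenario and are illustrative.
WIRE_MONEY = IntentSchema(
    name="wire_money",
    required=frozenset({"recipient", "amount"}),
    optional=frozenset({"source_account", "currency"}),
)

print(WIRE_MONEY.missing({"amount": 500}))
```

Here the bot has an amount but still needs a recipient before it can act, while the source account and currency can fall back to defaults.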

3. Divide To Conquer

Don’t expect intents to come with all their requisite entities in just one turn. People leave things out. Nobody types, “I’m looking to wire $500 from my savings account to Mike Watson.” Things like “Wire $500” are much more common. Consider what further steps your bot will need to take in order to fill in the gaps. Zoom in on those flowcharts from step 1 and, for each intent, map out all the possible entity combinations. Design the conversation flow accordingly.
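
A minimal sketch of that gap-filling loop, assuming entity extraction has already happened upstream and each turn arrives as a dictionary of whatever entities the customer supplied. The prompts and slot names are hypothetical.

```python
# Hypothetical follow-up questions, one per mandatory entity of "wire money".
PROMPTS = {
    "amount": "How much would you like to wire?",
    "recipient": "Who should receive the money?",
}


def next_prompt(collected):
    """Return the question for the first missing entity, or None when complete."""
    for slot, question in PROMPTS.items():
        if slot not in collected:
            return question
    return None


def run_turns(turns):
    """Merge the entities from each turn and record the questions the bot asks."""
    collected, asked = {}, []
    for entities in turns:
        collected.update(entities)
        question = next_prompt(collected)
        if question is None:
            break
        asked.append(question)
    return collected, asked


# "Wire $500" arrives first; the recipient only comes in the second turn.
collected, asked = run_turns([{"amount": 500}, {"recipient": "Mike Watson"}])
print(asked)
print(collected)
```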

4. “If I Remember Correctly …”

Your bot needs to remember things! Keep track of recent interactions (intents and entities). People tend to ask follow-up questions, and it’s a nice touch to be able to answer without the redundancy of requesting information they’ve already provided. Imagine that a customer asks for a specific bank branch address. The bot successfully responds to the intent, and then the user asks: “And when does it open?” The best chatbots will answer immediately, understanding that the conversational subject is still that same branch. Keep in mind that the same can be true of intents: A customer may ask “What are the Greenwood branch hours?” followed by “What about Capitol Hill?”
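
Here is a deliberately naive sketch of that kind of memory: whatever the current turn leaves out is filled in from the previous turn. The dictionary layout and intent names are assumptions for illustration; a production bot would also need rules for when old context should expire.

```python
def resolve(turn, context):
    """Fill gaps in the current turn from the previous one, then remember the result."""
    intent = turn.get("intent") or context.get("intent")
    entities = {**context.get("entities", {}), **turn.get("entities", {})}
    context.update(intent=intent, entities=entities)
    return intent, entities


context = {}

# "What is the Greenwood branch address?"
first = resolve({"intent": "branch_address", "entities": {"branch": "Greenwood"}}, context)

# "And when does it open?" -> new intent, branch carried over from memory
second = resolve({"intent": "branch_hours", "entities": {}}, context)

# "What about Capitol Hill?" -> new branch, intent carried over from memory
third = resolve({"entities": {"branch": "Capitol Hill"}}, context)

print(first, second, third, sep="\n")
```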

5. Know What To Do When You Don’t Know What To Do

Prepare to not understand everything your customer wants, and know how to respond accordingly. You can simply say, “Sorry, I didn’t get that,” but the best bots (like the best customer service reps) provide more useful responses, such as “I didn’t quite catch that. Do you want me to perform an online search?” Or, “I didn’t quite catch that. Do you mind asking the question a different way? Or shall I connect you to an agent?”
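
One common way to implement this (sketched here under assumptions, not as a prescribed design) is to act only when the intent classifier is confident, and to escalate to a human after repeated misses. The threshold, the failure limit, and the `respond` helper are all hypothetical values and names.

```python
CONFIDENCE_THRESHOLD = 0.6  # assumed cutoff; tune against your own data
MAX_FAILURES = 2            # assumed number of misses before offering an agent


def respond(intent, confidence, failures):
    """Return (reply, updated failure count) for one classified utterance."""
    if confidence >= CONFIDENCE_THRESHOLD:
        return f"Handling intent: {intent}", 0
    failures += 1
    if failures >= MAX_FAILURES:
        return "I didn't quite catch that. Shall I connect you to an agent?", failures
    return "I didn't quite catch that. Do you mind asking the question a different way?", failures


reply, failures = respond("check_balance", 0.35, failures=0)
print(reply)
reply, failures = respond("check_balance", 0.40, failures=failures)
print(reply)
```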

6. “Let’s Run It From The Top”

Even though you’ll do everything in your power to avoid it, your bot could get lost in complex conversations where customers express a high number of unique intents. That’s why users should always have the option to restart the conversation from scratch. A clean slate beats a long stream of frustrating interactions from which you won’t be able to recover.

7. Control What You Can Control

You can’t control what the customer is going to say, but you sure can control how your bot will respond. Invest in variability. Different greeting and parting phrases are a nice touch, as is addressing customers by name.
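
A small sketch of what that variability can look like in practice: keep several templates per message and pick one at random. The greeting texts and the `greet` helper are illustrative, not taken from a real product.

```python
import random

# Hypothetical greeting templates; {name} is filled in with the customer's name.
GREETINGS = [
    "Hi {name}!",
    "Hello {name}, how can I help?",
    "Good to see you, {name}.",
]


def greet(name, rng=random):
    """Pick a greeting at random so repeat visitors don't see the same line."""
    return rng.choice(GREETINGS).format(name=name)


print(greet("Ana"))
```

Passing in a seeded `random.Random` instance as `rng` keeps the choice reproducible in tests.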

8. Quality Is Variability. Variability Is Quality.

People express the same intents and entities in a multitude of different ways. Investing in data collection that gathers comprehensive variants for how people express certain bits of information is one of the most important steps on the road to building successful virtual agents. Only then will your bot understand that “How much did I spend between November 1st and November 30th?” is the same as “How much have I spent this month?”
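
To make the spending example concrete: two phrasings count as “the same” once they resolve to the same structured query. The sketch below skips the language understanding entirely and only shows the target representation, assuming a “current” date in November 2018; the function names are hypothetical.

```python
import calendar
from datetime import date


def explicit_range(start, end):
    """Structured form of 'between <start> and <end>'."""
    return (start, end)


def this_month(today):
    """Structured form of 'this month': first to last day of today's month."""
    last_day = calendar.monthrange(today.year, today.month)[1]
    return (today.replace(day=1), today.replace(day=last_day))


today = date(2018, 11, 15)  # assumed "current" date for the example
spelled_out = explicit_range(date(2018, 11, 1), date(2018, 11, 30))
relative = this_month(today)
same_query = spelled_out == relative
print(same_query)
```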

9. Sound Like a Local

People in the Pacific Northwest might refer to their savings accounts as “rainy-day” funds, whereas customers in the deep south may prefer the term “honey-pot.” On the global scale, in the US, people like to say “checking account,” but in the UK, “main” or “current” are the more popular terms. A globalized company looking to serve a broad customer-base needs to understand how different consumer blocs speak at a granular level. That way, their bot can properly interface with every customer. Here, once again, the world’s most clever algorithm won’t save you. It’s all about the data.
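
A toy illustration of the idea, using the terms mentioned above: regional phrases are mapped to one canonical account type per locale. The locale codes, the canonical labels, and the lookup function are assumptions, and a hand-written table like this is only a stand-in for what properly collected regional data would provide.

```python
# Hypothetical lookup from regional phrasing to a canonical account type.
ACCOUNT_TERMS = {
    "en-US": {
        "checking account": "checking",
        "rainy-day fund": "savings",
        "honey-pot": "savings",
    },
    "en-GB": {
        "current account": "checking",
        "main account": "checking",
    },
}


def canonical_account(phrase, locale):
    """Return the canonical account type for a phrase, or None if unknown."""
    return ACCOUNT_TERMS.get(locale, {}).get(phrase.strip().lower())


print(canonical_account("Current account", "en-GB"))
print(canonical_account("rainy-day fund", "en-US"))
```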

10. Precision. Precision. Precision. 

To quote Google’s Peter Norvig, “More data beats better algorithms, but better data beats more data.” Collecting a lot of variants and running them through intent classifiers and entity-taggers only works if that data is annotated correctly. When a customer says, “check balance,” your bot needs to understand that “check” can serve as both a noun and a verb depending on the context. Otherwise, your customers will be ramming their heads against the wall with something as simple as checking the balance of their savings account. All the data in the world does you no good if it’s improperly annotated.
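
As a small, hedged example of what an annotation check can look like: the records below use a made-up format with character offsets for each entity, and the helper flags any entity whose offsets do not actually cover the annotated text. Note how “check” is part of the intent in one utterance and a tagged entity in the other.

```python
# Hypothetical annotated utterances. Offsets are character positions in "text".
EXAMPLES = [
    # "check" is a verb here, so it carries the intent and is not tagged.
    {"text": "check balance", "intent": "check_balance", "entities": []},
    # "check" is a noun here, so it is tagged as an entity.
    {
        "text": "deposit a check",
        "intent": "deposit_check",
        "entities": [{"start": 10, "end": 15, "label": "instrument", "value": "check"}],
    },
]


def span_errors(example):
    """Return the entities whose offsets don't match their annotated value."""
    return [
        ent
        for ent in example["entities"]
        if example["text"][ent["start"]:ent["end"]] != ent["value"]
    ]


for example in EXAMPLES:
    print(example["text"], "->", span_errors(example))
```

A check like this only catches mechanical slips. Whether “check” was given the right label in context still takes a careful human annotator.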

An Interview With Dinheiro Vivo

At last week’s Web Summit, we were lucky enough to sit down with Dinheiro Vivo, a leading financial publication in Portugal. Our conversation touched on everything from the quality-focused approach to training data to AI use-cases across industries. Watch the full interview here (in Portuguese), or check out the English transcription below:

Dinheiro Vivo [DV] – For those who still do not know your work, what does DefinedCrowd do?

Daniela Braga [DB] – We are a data collection and cleaning platform for Machine Learning and for Artificial Intelligence. Our platform combines crowdsourcing with machine learning. A mixture of people and machines working at the same time.

Artificial Intelligence is the imitation of a human brain, artificially. To develop our own intelligence, we go to school, we read many books. It takes a lifetime for a person to be able to react and make decisions in their daily lives. The way machines learn is similar. But with the computational capacity that is currently possible using the cloud and our platform, in just 3 months we can combine thousands of human brains in the same computational memory.

DV – And then (the data) is used in applications that we use every day?

DB – Namely Apple Siri, Google Assistant, Alexa. Self-driving cars and even more industrial applications like machines that are doing quality control instead of having people doing it. Or at airports with automatic flight controllers.

DV – And this year was a dazzling year, an investment round, new partners, an office in Tokyo. What have you done to achieve this success?

DB – Our largest market is still the United States, followed by Japan and then Europe. Clients there are more open; they’re investing in Artificial Intelligence.

There are many companies in machine learning but there is practically no one doing data cleaning and treatment like us. It’s like the spades and pickaxes of the gold rush. We are making the shovels and picks of AI, of modern times.

DV – And coming to Web Summit also makes all the difference, right?

DB – Our growth milestones have been basically aligned with those of Web Summit. We have been here since 2016, which was the first year Portugal hosted Web Summit. We had closed our seed series. Last year (2017) we basically met the group of investors to close the series A. And this year we are here to be on the list of the top 10 AI companies in the world.

DV – And finally, when you leave Web Summit, what do you expect to take with you?

DB – This year it’s basically a visibility and recruitment maneuver: we are in an aggressive recruitment phase. We want to demonstrate that this is really the best place to work in Portugal. We’re also looking to continue developing partnerships and solidify our go-to-market strategy. Next year, I would like for 50% of our revenue to come from partnerships.