Siri vs Cortana vs Google Now – how easy are they to implement?

As developers, we’re always looking to use the right tool for the job. Virtual Personal Assistants (VPAs) are becoming a thing now. Well, I say “now”, but they have been around for years and they don’t seem to be going anywhere. Apple, Google and Microsoft have all made a play in this field and have come up with some pretty amazing VPAs. But which one is easiest for developers to implement? And which one provides the most functionality for developers to take advantage of?

I set out to develop three applications, one for each of the VPAs: an Android application, an iOS application and a Windows 10 application. All three apps have the same functionality: they let you interact with the VPA to find out more information through your app. The app I planned to build was a to-do list, because I think it allows me to exercise a wide range of features.

Proposed Tests:

  • Test 1: Calling the VPA inside of the application to do a task for you
  • Test 2: Calling the VPA outside of the application as a background task, to again complete a task for you
  • Test 3: Calling the VPA to open your application

Here’s what I found out:

Siri

Nada, Zip, Zilch, Zero. Apple doesn’t provide a public API for Siri. I’m not sure what more I can add, as I can’t test any developer scenario with Siri, although I have a feeling Apple will eventually open the Siri APIs to the public in order to compete with Google and Microsoft.

However, Siri does let you open first- and third-party applications without you having to write any code to enable this, which is pretty cool. So despite not having a developer API right now, it does save developers the trouble of writing code to launch their application via Siri – this definitely passes Test 3!

Google Now

Google is currently developing an API to allow developers to integrate their content into Google Now. It is not public yet, but some developers have been selected and given tools to integrate their apps with Google Now.

Because of this, I can’t test out the developer scenarios for Tests 1 and 2. However, Test 3 can be implemented. Here’s how you can allow your application to be launched via Google Now:

  • Create a new folder called ‘xml’ within the /res folder
  • Inside it, create a new XML file named ‘searchable.xml’
  • Add the following code to the XML file, where label references a string resource containing the name of your application (or the phrase you would like the user to use to launch your application)
<searchable xmlns:android="http://schemas.android.com/apk/res/android"
       android:label="@string/app_name" >
</searchable>
  • Now, in the Android manifest (AndroidManifest.xml), add a search intent filter and the searchable metadata to your main activity:
<activity
    android:name="com.authorwjf.myvoiceactivatedapp.MainActivity"
    android:label="@string/app_name" >
    <intent-filter>
        <action android:name="android.intent.action.SEARCH" />
    </intent-filter>
    <meta-data android:name="android.app.searchable" android:resource="@xml/searchable"/>
    <intent-filter>
        <action android:name="android.intent.action.MAIN" />
        <category android:name="android.intent.category.LAUNCHER" />
    </intent-filter>
</activity>

And that’s it! I don’t know about you, but I think that’s super easy to do: it’s hardly any extra lines of code. This passes Test 3.

Cortana

Having used Cortana integration before, I know that I can do all of the proposed tests. However, that isn’t the point of this article: I want to find out which VPA is easiest for developers to implement, not whether each can pass the proposed tests. So I’m not sure I can rate how hard or easy Cortana is to implement when there is nothing to compare it to.

Like Siri, Cortana can open your application without you writing any code. But you do have the option to write your own code to customise the way users start your application via the VPA.

Since I couldn’t run the first two tests with Siri and Google Now, I’m going to leave them out until there are more VPA APIs to compare those features to. If you’d like to know how to implement the first two tests with Cortana, I’ll be posting a tutorial about this soon, so keep an eye out!

As for customising your own way of launching an application via Cortana, here are the steps needed to achieve this:

  • Create a new XML file called ‘VoiceCommands.xml’ in the root of your project
  • Add the following code:
<?xml version="1.0" encoding="utf-8" ?>

<VoiceCommands xmlns="http://schemas.microsoft.com/voicecommands/1.2">
  <CommandSet xml:lang="en-gb" Name="HoLCommandSet_en-gb">

    <CommandPrefix>To Do</CommandPrefix>
    <Example>launch</Example>

    <Command Name="LaunchApp">
      <Example>launch</Example>
      <ListenFor>launch</ListenFor>
      <Feedback>Opening your To-Do app</Feedback>
      <Navigate Target="MainPage.xaml"/>
    </Command>

  </CommandSet>
</VoiceCommands>

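  • Now, in the App.xaml.cs file, add the following code to handle activation by a voice command:
protected override void OnActivated(IActivatedEventArgs args)
{
    // Only handle activations that came from a voice command.
    if (args.Kind == ActivationKind.VoiceCommand)
    {
        var commandArgs = (VoiceCommandActivatedEventArgs)args;
        var speechRecognitionResult = commandArgs.Result;

        // Get the name of the voice command and the text spoken.
        var voiceCommandName = speechRecognitionResult.RulePath[0];
        var textSpoken = speechRecognitionResult.Text;

        // Reuse the existing root frame, or create one if the app was not already running.
        var rootFrame = Window.Current.Content as Frame;
        if (rootFrame == null)
        {
            rootFrame = new Frame();
            Window.Current.Content = rootFrame;
        }

        switch (voiceCommandName)
        {
            case "LaunchApp":
                rootFrame.Navigate(typeof(MainPage), commandArgs.Result);
                break;

            default:
                rootFrame.Navigate(typeof(MainPage));
                break;
        }

        // Make sure the window is visible after a voice activation.
        Window.Current.Activate();
    }
}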
  • Also add the following method to install the command set, and call it from the OnLaunched method (inside App.xaml.cs):
// Requires: using Windows.ApplicationModel.VoiceCommands;
private async void SaveVoiceCommands()
{
    // Load the voice command definition file packaged with the app.
    var storageFile = await Windows.Storage.StorageFile.GetFileFromApplicationUriAsync(
            new Uri("ms-appx:///VoiceCommands.xml"));

    // Register the command set with Cortana.
    await VoiceCommandDefinitionManager.InstallCommandDefinitionsFromStorageFileAsync(storageFile);
}
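
As a rough sketch of that last step, the call could sit at the end of OnLaunched. This assumes the default OnLaunched generated by the blank UWP app template (elided here as a comment) and the SaveVoiceCommands method shown above. Treat it as an illustration of where the call goes rather than a complete App class:

```csharp
using Windows.ApplicationModel.Activation;
using Windows.UI.Xaml;

sealed partial class App : Application
{
    protected override void OnLaunched(LaunchActivatedEventArgs e)
    {
        // ... the template code that creates the root Frame, navigates to
        // MainPage and activates the window stays as it is (assumed) ...

        // Install (or refresh) the voice command definitions on every launch.
        // SaveVoiceCommands is async void, so this call returns immediately
        // and the installation completes in the background.
        SaveVoiceCommands();
    }
}
```

Installing on every launch means changes to VoiceCommands.xml are picked up after an app update. It also means the custom commands only become available once the app has been run at least once.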

And that’s it. Now you can open your application via Cortana by saying “To Do, launch”: the command prefix comes first, followed by the phrase in ListenFor. Again, this isn’t hard at all, just an XML file and some lines of code to register the voice commands, so it passes Test 3. I would say, though, that it requires more work than the Android implementation.

Conclusion

Tests 1 and 2: Cortana is the only Virtual Personal Assistant able to complete these tasks, for now. Since I can’t compare the developer implementation of this against Siri and Google Now, we will have to leave the question of which is easiest to implement unanswered.

Test 3: All three VPAs allow for this feature, so they all pass the test! When it comes to developer implementation, Siri and Cortana don’t require you to write any code, so you could say they are both winners here. However, Cortana also offers a developer implementation option and Siri doesn’t, so Siri doesn’t really qualify for the question of which is easiest to implement. So I’m not sure what to do here! Thoughts?

Google Now does require some code to be written, but it’s easy to implement. Cortana doesn’t need code to pass Test 3, but it does give the developer the option to customise the way users launch their app with Cortana. So judging on which is easier to develop: Google Now needs fewer lines of code and is really easy to understand, and I would say it is even easier to implement than Cortana.

So in order of easiest to use (if we include not having to write any code into the mix):

  1. Siri & Cortana
  2. Google Now

But purely from a developer implementation view (and what this blog is really about!):

  1. Google Now
  2. Cortana

Final Thoughts

I’m glad I set out to look into the Virtual Personal Assistant space because it has truly surprised me. Starting out on this project, I was sure that all of the major providers had a VPA that could integrate with apps in its ecosystem, and so I was expecting to do some development and find out which one was easiest to implement across a range of scenarios.

Unfortunately, I can’t answer the questions that started off this article. It would be unfair to crown Cortana the winner here when there is nothing to compare it to for two of the three tests. And although Cortana is great, it’s not super easy to implement, depending on the scenario of course: background tasks take more effort than foreground tasks, and so on. As for the second question, which of these three VPAs provides the most features for developers, Cortana provides the most right now.

So for now, I’m going to put this project on hiatus and hopefully pick it up again when more Virtual Personal Assistant APIs are publicly available for developer use. If you have any thoughts on some available VPA APIs that are not from Google, Apple or Microsoft – let me know and I’d love to try them out!

Next up, maybe iOS vs Android vs Windows app speech recognition?