====== Creating an Alexa Skill to Play a Music File ======

{{ youtube>Yco-tKQ4k1k?large }}

<!-- Replace the above line with the name of your "How To" Tutorial e.g. How to Laser cut Your Name in Wood -->
  
**Keywords:** alexa skill tutorial voice audio playback alexa sdk
\\

----
  
<!-- Add a representative photo of your tutorial below.  Make it centered in the page -->
{{ sricoy:alexatesting.jpg?500 |}}
\\
By the end of this tutorial, you'll have a working Alexa Skill that plays music, triggered by speaking a predefined phrase.

This picture shows part of the Amazon Developer Console, which gives developers access to many of the services Amazon offers; in this case, it is used to test the voice-controlled side of an Alexa skill. We need to understand how to stream files through our Alexa-enabled device. Solving this is important because it demonstrates the ability to fetch files from the web with the Alexa Voice Service (AVS), which implies that AVS can also be used to control other items in the cloud and connected devices. This tutorial shows you how to set up a Lambda function that talks to the voice-operated end of an Alexa skill. It takes approximately 40 minutes to complete.

----
\\
===== Motivation and Audience =====
  * [[music_player_skill#Programming | Programming]]
  * [[music_player_skill#Final Words | Final Words]]

----

  
==== Required Items ====
Line 44: Line 64:
To complete this tutorial, you'll need the following items:
    
  * The contents of this git repository: [[https://github.com/santidingo/alexa_Music_Example_DASL|https://github.com/santidingo/alexa_Music_Example_DASL]]
  * Computer with Internet access
  * Wireless internet access (if using an Amazon device)
  * Node.js runtime installed (get it [[https://nodejs.org/en/|here]])
  * Text editor ([[https://atom.io/|Atom]] was used for this, but others will work)
  * Optionally: An Alexa-enabled device

**NOTE**: The Echo Dot was used for the creation of this tutorial (the most convenient option, if not also the least expensive). Using a physical device is optional, because skills can be tested without one. However, since the testing environment gives no audio output, we would not be able to hear that music is actually being streamed without a device.
  
\\
  
  
----


==== Construction ====
  
=== Background ===
  
The Alexa Voice Service (AVS), by Amazon, has a developer package called the Alexa Skills Kit (ASK) that allows users to create new skills for Alexa-enabled devices. All Alexa-enabled devices can perform functions with these skills. The voice-recognition software is already taken care of, leaving the developer free to focus on designing the actual commands.
  
Most Alexa skills are made up of a skill created with the Alexa Skills Kit in the Amazon Developer Console and an AWS Lambda function that the skill sends its requests to.
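Concretely, each verbal command becomes a JSON request that the skill sends to its Lambda function. As a rough, heavily trimmed sketch (the field values here are placeholders, and "PlayAudio" is the custom intent this tutorial defines later), such a request looks like:

<code javascript>
{
  "session": {
    "application": { "applicationId": "amzn1.ask.skill.EXAMPLE-ID" }
  },
  "request": {
    "type": "IntentRequest",
    "intent": { "name": "PlayAudio" }
  }
}
</code>

The Lambda function reads the request type and intent name to decide which handler to run.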
  
Here we'll initiate the two parts and then connect them to complete our skill:
  
**Step 1**:

Sign in to the Amazon Developer Console:

If you do not already have one, create an [[https://developer.amazon.com/|Amazon Developer Account]] and sign in to the **Amazon Developer Console**.
**Step 2**:

Sign in to the AWS Management Console:

In a separate tab, create an [[https://aws.amazon.com/console/|AWS Account]] if you do not already have one, and sign in to the **AWS Management Console**.
  
**Step 3**:

Creating an IAM role:

{{:sricoy:echo_music:selection_024.png?500|}}

From the AWS Management Console, under the "Security, Identity, and Compliance" category, click "IAM". Go to "Roles" and create a new role. Name the role whatever you like; the name doesn't matter. Go to the next page.

{{:sricoy:echo_music:selection_006.png?500|}}

This role will be given to a Lambda function, so select that option. Click through to the next page.

{{:sricoy:echo_music:selection_007.png?500|}}

Select the "CloudWatchLogsFullAccess" and "AmazonDynamoDBFullAccess" policies. Click to the next page.

Review and create your role.
**Step 4**:

Creating a Lambda function:

In the upper right-hand corner, confirm that your region is set to "N. Virginia". The AVS can only work with Lambda in this region.

In the upper left of the window, click "Services", and under "Compute" select "Lambda". You should get some sort of introduction page if you're new. Click through to get started and "Create a Lambda Function".

{{:sricoy:echo_music:selection_012.png?500|}}

Now we are prompted to select a blueprint. Click "Configure Triggers" on the left.

{{:sricoy:echo_music:selection_013.png?500|}}

Select the Alexa Skills Kit as your trigger. Move on to the next page.

{{:sricoy:echo_music:selection_014.png?500|}}

Here you'll be prompted to give a name, description, and runtime. The first two are for your reference. As for the runtime, for this demo we'll use Node.js 4.3.

{{:sricoy:echo_music:selection_015.png?500|}}

Scroll down to "Lambda function handler and role". We will use an existing role, choosing the name of the IAM role we created in the previous step.

We will complete the Lambda function right after creating the Alexa Skill.

**Step 5**:

Configuring the Alexa Skill:

{{:sricoy:echo_music:selection_025.png?500|}}

<!--==========================================================-->
From the Amazon Developer Console dashboard, click on the "Alexa" tab and then "Get Started" with the Alexa Skills Kit.
  
{{:sricoy:echo_music:selection_026.png?500|}}

Click "Add a new skill". This will put you into the sequence for configuring your Alexa skill. We will be creating a "custom skill", so please select that option.

{{:sricoy:echo_music:selection_027.png?500|}}

The first section of the Alexa Skill, "Skill Information", allows you to change the skill's name as well as the invocation name. This is also where you can come back to find the skill's app ID after it is generated.

The skill's name is for your reference and is what shows up in the Alexa app should your skill be published. The invocation name is what will be said aloud to initiate the skill with an Alexa-enabled device. There is a link next to these options that can teach you more about invocation phrases.

Please select the "yes" radio button to confirm that our skill will use audio directives, since we will be playing music. Then click "Next".

This page allows us to create our interaction model. The top text box allows us to define, in an intent schema, which intents we would like to use. Intents are requests that can be triggered by verbal commands. A few of our intents are required by the Alexa Skills Kit because we will be streaming music. Intents are specified in JSON format.

{{:sricoy:echo_music:selection_028.png?500|}}

In the downloaded repository, inside the "speechAssets" folder, there is a file named "IntentSchema". Copy and paste its contents into the intent schema text entry field. More information on intents can be found via the links provided next to the text box.
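Before you open the file, it may help to know what you're looking at: an intent schema is just a JSON object listing intent names. A minimal sketch along these lines (illustrative only; the repository's IntentSchema file is the authoritative version) looks like:

<code javascript>
{
  "intents": [
    { "intent": "PlayAudio" },
    { "intent": "AMAZON.PauseIntent" },
    { "intent": "AMAZON.ResumeIntent" },
    { "intent": "AMAZON.StopIntent" }
  ]
}
</code>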

{{:sricoy:echo_music:selection_029.png?500|}}

Also in the "speechAssets" folder is a file named "Utterances.txt". Please copy and paste its contents into the "Sample Utterances" text entry field. The first portion of each line (here it is "PlayAudio") is the intent that is invoked by the sample phrase written after it.
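For illustration only (the phrases below are made up; use the repository's Utterances.txt for the real list), the format pairs an intent name with a phrase a user might say on each line:

<code>
PlayAudio play some music
PlayAudio start playback
PlayAudio play my song
</code>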

Click "Next" to save and move to the next page.
 + 
**Step 6**:

The Global Fields page is where we define where our requests will be sent.

{{:sricoy:echo_music:selection_016.png?500|}}

We need to complete our Lambda function in order to complete this portion. Go into the "Skill Information" section of the Alexa skill and copy the Alexa Skill App ID. In the repository downloaded earlier, open the "constants.js" file and paste the App ID into the "appId" value.
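As a sketch of what you're editing (the real file is constants.js in the repository; the values below are placeholders, and the field names match the index.js snippet shown later in the Programming section):

<code javascript>
'use strict';

// Exported settings read by index.js when the Lambda function starts.
module.exports = Object.freeze({
    appId: 'amzn1.ask.skill.REPLACE-WITH-YOUR-APP-ID', // copied from the Skill Information page
    dynamoDBTableName: 'MusicPlayerSkillTable'         // table the Lambda function will create
});
</code>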

Scroll back up to "Lambda function code". For this step, open the repository downloaded earlier (linked in the "Required Items" section above).

With Node.js installed, using a command line (the Git shell is what I used on Windows), navigate to the "js" directory within the "skill-sample-nodejs-audio-player" folder.

Use this command:

<fc green>
npm install
</fc>

{{:sricoy:echo_music:selection_018.png?500|}}

It will probably stall for a moment; don't worry. Once complete, you will see that a "node_modules" folder has been created in the "js" directory.

{{:sricoy:echo_music:selection_019.png?500|}}

Select everything within the "js" directory, and compress it into a zip file.

{{:sricoy:echo_music:selection_030.png?300|}}

We will now upload the created zip file in the "Lambda function code" section of our Lambda function. Move on to the next page and hit "Create function". Once created, you will be given an Amazon Resource Name (ARN) at the top right of the page. Copy this.

{{:sricoy:echo_music:selection_021.png?500|}}

In your Alexa Skill configuration page, select ARN as your service endpoint type and paste your ARN into the text box, being sure to check the North America box. Move on to the next page; now everything is done!

It's time to begin testing...and maybe troubleshooting.
 + 

----
  
  
==== Programming ====

We got it working; so how exactly does this thing work?

Well, we'll go through a quick overview here, and I'll link relevant content as I go.

1. **The Alexa Skill**

The skill built through the developer console has no actual programming involved.

What we do define is what the end user will say to invoke our skill, or to navigate options within the skill.

{{:sricoy:echo_music:alexainteractionmodel.png?nolink&600|}}

The intent schema and sample utterances combined represent things the user might say, and what to do in response.

The intent schema is where we set up which intents we will use. Intents are a way of specifying the request sent by the skill to our function in Lambda.

In our example, we use a custom intent called "PlayAudio". In the sample utterances section, we wrote possible things the end user might say that we want to invoke the "PlayAudio" intent. You will notice that there are other intents in our intent schema. These are built-in intents, and we do not have to define the utterances that Alexa will understand for them.

For more on defining an Alexa skill's voice interface, please visit this [[https://developer.amazon.com/public/solutions/alexa/alexa-skills-kit/docs/defining-the-voice-interface#The Intent Schema|link]].

{{:alexaglobalfields.png?nolink&500|}}

We must send our request somewhere, and the global fields section of our skill allows us to specify whether we are sending requests to our own URL or to an AWS Lambda function. It is easier to use the Amazon Resource Name (ARN) of a Lambda function, because we do not have to deal with any specifics of how our requests reach the Lambda function; we only need to write the code that our function will use to handle requests.

2. **The Lambda Function**

The Lambda function is where our code is hosted; it responds when triggered by the Alexa skill. Otherwise, the function sits idle and does nothing, making it very efficient for our purposes. With that said, let's take a brief look at our code below.
 +
<code javascript>
'use strict';

var Alexa = require('alexa-sdk'); //import the Alexa SDK
var constants = require('./constants'); //we will need values from constants.js
var stateHandlers = require('./stateHandlers'); //need to register state handlers
var audioEventHandlers = require('./audioEventHandlers'); //must register audio event handlers

exports.handler = function(event, context, callback){
    var alexa = Alexa.handler(event, context);
    alexa.appId = constants.appId;
    alexa.dynamoDBTableName = constants.dynamoDBTableName; //table name
    alexa.registerHandlers( //this function allows us to register our handlers
        stateHandlers.startModeIntentHandlers, //we can register multiple at a time
        stateHandlers.playModeIntentHandlers,
        stateHandlers.remoteControllerHandlers,
        stateHandlers.resumeDecisionModeIntentHandlers,
        audioEventHandlers //there are multiple handlers in this file
    );

    alexa.execute(); //dispatch the incoming request to the registered handlers
};
</code>

If you recall, we used "index.handler" when we defined our handler on the Lambda configuration page. That is how our Lambda function accesses the handlers used to process requests. A handler looks for a specific intent and, when it is invoked, runs the actions we program. The actual handlers can be found inside the files imported into index.js.
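As a rough sketch of what one of those handlers looks like with the alexa-sdk (illustrative only, not the repository's exact code; the stream URL and token are placeholders):

<code javascript>
// A handler maps an intent name to the action to run when that intent is invoked.
var handlers = {
    'PlayAudio': function () {
        // Queue an AudioPlayer "play" directive and finish the response.
        this.response.audioPlayerPlay(
            'REPLACE_ALL',                   // clear the queue and play this stream now
            'https://example.com/audio.mp3', // placeholder stream URL
            'example-token',                 // token identifying this stream
            null,                            // no expected previous token
            0                                // start at offset 0 ms
        );
        this.emit(':responseReady');
    }
};
</code>

In the real project these handlers live in the files imported by index.js and are registered through alexa.registerHandlers().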

For more on handling requests, please visit this link: [[https://developer.amazon.com/public/solutions/alexa/alexa-skills-kit/docs/handling-requests-sent-by-alexa|Handling Requests]]
 +
3. **IAM and the DynamoDB Table**

{{:iammanagement.png?nolink&500|}}

In our example, we don't explicitly set up an Amazon DynamoDB table; rather, it is set up by our Lambda function. However, when we created a role with the Identity and Access Management (IAM) service and set it as our Lambda function's role, we gave our function permission to create that table for us, as well as access to the Amazon CloudWatch service.

Amazon CloudWatch is a service that allows us to track metrics of our Amazon Web Services (AWS) account. This ability isn't explicitly used by the skill; it is the DynamoDB permissions in the role that let the function create its table.

**NOTE:** If you intend to repurpose this sample and change what is actually played by the skill, you will need to go into DynamoDB and delete the table created by the function. The table holds details about what the user has played, and these may not be replaced correctly when you switch to new audio sources.
  
----
==== Final Words ====
  
This walkthrough has gotten us off the ground with a music-playing Alexa skill, using built-in intents, Amazon DynamoDB, and Amazon IAM. For more on how skills work, please review the links [[echo_tutorial|here]] for an introduction to developing for Alexa.

For more information on this particular sample, please go through the README.md file in the repository.

For questions, clarifications, etc., email: <ricoys1@unlv.nevada.edu>
  
  
music_player_skill.1483075206.txt.gz · Last modified: 2016/12/29 21:20 by santiagoricoy