Let's get into the last part of the Twitter Analysis series. So far we have covered:
- HANA studio overview
- Basic SQL
- How tables are created
- Custom Configuration
Link to the previous blogs on the same topic:
http://scn.sap.com/docs/DOC-56325
Let's see how to create a Custom Dictionary:
To create the Custom Dictionary, follow the same path we used in the previous blog to create the Custom Configuration.
--How to name the Dictionary--
(XYZ).hdbtextdict "Replace XYZ with a name that follows your naming convention"
Copy the code below into the Custom Dictionary we just created:
<?xml version="1.0" encoding="UTF-8"?>
<dictionary xmlns="http://www.sap.com/ta/4.0">
  <entity_category name="PERSON">
    <entity_name standard_form="Virat Kohli">
      <variant name="virat" />
      <variant name="Virat" />
      <variant name="virat kohli" />
      <variant name="Virat Kohli" />
      <variant_generation type="standard" language="english" />
    </entity_name>
  </entity_category>
</dictionary>
Note:
What did we achieve with the above code? We listed a few ways users might type Virat Kohli's name, and kept the output uniform by setting <entity_name standard_form="Virat Kohli">.
So when we run the SQL on TA_TYPE = 'PERSON', we no longer get a separate row for each spelling of Virat Kohli; all of them come back as one row under the standard form
(refer to Blog 1 for this query).
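To make the idea of variants and a standard form concrete, here is a minimal Python sketch that parses the same dictionary locally and shows which standard form each listed spelling maps to. This is only an illustration: the function name variant_map is made up for this example, it runs outside HANA, and it does not reproduce what HANA's text analysis engine or the variant_generation rule does internally.

```python
import xml.etree.ElementTree as ET

# The dictionary from this post, written as well-formed XML.
DICTIONARY_XML = """<?xml version="1.0" encoding="UTF-8"?>
<dictionary xmlns="http://www.sap.com/ta/4.0">
  <entity_category name="PERSON">
    <entity_name standard_form="Virat Kohli">
      <variant name="virat" />
      <variant name="Virat" />
      <variant name="virat kohli" />
      <variant name="Virat Kohli" />
      <variant_generation type="standard" language="english" />
    </entity_name>
  </entity_category>
</dictionary>
"""

# Namespace declared on the <dictionary> root element.
NS = {"ta": "http://www.sap.com/ta/4.0"}


def variant_map(xml_text):
    """Return {variant: (category, standard_form)} for the explicitly listed variants.

    Hypothetical helper for illustration only; it ignores <variant_generation>.
    """
    root = ET.fromstring(xml_text.encode("utf-8"))
    mapping = {}
    for category in root.findall("ta:entity_category", NS):
        for entity in category.findall("ta:entity_name", NS):
            standard = entity.get("standard_form")
            for variant in entity.findall("ta:variant", NS):
                mapping[variant.get("name")] = (category.get("name"), standard)
    return mapping


mapping = variant_map(DICTIONARY_XML)
for spelling, (category, standard) in mapping.items():
    print(f"{spelling!r} -> {standard!r} ({category})")
```

Every listed spelling points to the same standard form, which is why the PERSON results collapse into one entry. As a side benefit, parsing the file this way also fails loudly if the XML is not well-formed, which is worth checking before activating the dictionary.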
We can keep modifying the Custom Dictionary whenever and wherever needed.
Let's include this in the Custom Configuration now:
1. Open the Custom Configuration we created in Blog 2.
2. Search for <!-- List of Text Analysis extraction dictionaries for SentimentAnalysis. -->
3. Scroll down and, just before the closing </property> </configuration>, add the line below:
<string-list-value>sap.hana.ta.config::(XYZ).hdbtextdict</string-list-value>
Let's create the Index now:
CREATE FULLTEXT INDEX ipl ON "IPL"."IPL Match_Twitter Data" ("tweetContent")
CONFIGURATION 'sap.hana.ta.config::CONFIGURATION_NAME'
TEXT ANALYSIS ON;
Note: In place of CONFIGURATION_NAME, pass the name of the Configuration we created in Blog 2.
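If you prefer to generate the statement rather than edit it by hand, the substitution can be sketched in Python as below. This is a rough sketch under assumptions: build_fulltext_index_sql is a hypothetical helper, MY_CONFIG is a placeholder for your own Configuration name, and it assumes HANA expects the keyword FULLTEXT and the configuration as a single quoted 'package::name' string. It only builds the text of the statement; it does not connect to HANA or run anything.

```python
def build_fulltext_index_sql(index, schema, table, column, package, config_name):
    """Assemble the CREATE FULLTEXT INDEX statement text (hypothetical helper)."""
    # Package and configuration name go inside ONE pair of single quotes.
    configuration = f"{package}::{config_name}"
    return (
        f'CREATE FULLTEXT INDEX {index} ON "{schema}"."{table}" ("{column}")\n'
        f"CONFIGURATION '{configuration}'\n"
        "TEXT ANALYSIS ON;"
    )


# MY_CONFIG is a placeholder; use the Configuration name from Blog 2.
sql = build_fulltext_index_sql(
    index="ipl",
    schema="IPL",
    table="IPL Match_Twitter Data",
    column="tweetContent",
    package="sap.hana.ta.config",
    config_name="MY_CONFIG",
)
print(sql)
```

The printed statement can then be pasted into the SQL console in HANA studio.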
What’s next?
Run the SQL now; the result we saw in the Blog 1 screenshot should change.
Try it and see the difference.
Bye for now, see you soon with something new.