Abstract:
Over the past few years, Natural Language Generation (NLG) has received a lot of attention from
researchers for various applications, but English sentence generation for communication by
autistic children has received minimal attention. This research paper presents a novel approach
to sentence generation that uses the “revised Purdom’s algorithm”, with autistic children in
mind, to generate sentences that are both syntactically and semantically correct. Autism, or
Autism Spectrum Disorder (ASD), is a disability, commonly identified in children, that results in
problems related to communication and social skills. Due to this disability, an autistic child may
feel isolated and face difficulty in communicating with others or leading a normal life.
Most current approaches in NLG and NLP focus mainly on the generation of grammatically
correct sentences, with little attention paid to semantics. The aim of this research is to
restructure the “revised Purdom’s algorithm” to include the concept of probabilities and a pool
of sentences (specific or generic) so that it generates sentences that are both grammatically
and semantically correct. A pre-prepared list of sentences is first refined based on the words
(nouns) selected by the user. The refined list is then used to calculate, for each selected word,
the probability of its occurrence at a given position in a sentence, which determines where the
word is placed. Once the positions are determined, sentences are generated using the given
grammar/production rules.
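
The probability step described above can be illustrated with a minimal sketch. It is not the
implementation evaluated in this research: the sentence pool, the function names, and the
simplification of "position" to an early/late half of the sentence are all assumptions made
only for illustration.

```python
from collections import Counter

# Hypothetical pool of pre-prepared sentences (illustrative only, not from the paper).
SENTENCE_POOL = [
    "the boy eats an apple",
    "the boy drinks milk",
    "the girl eats an apple",
    "the dog chases the boy",
]

def refine_pool(pool, selected):
    """Keep (tokenized) sentences that contain at least one user-selected word."""
    wanted = set(selected)
    return [s.split() for s in pool if wanted & set(s.split())]

def early_probability(refined, word):
    """Estimate P(word occurs in the first half of a sentence) from the refined pool."""
    counts = Counter()
    for tokens in refined:
        if word in tokens:
            # Assumed simplification: a position is either "early" or "late".
            slot = "early" if tokens.index(word) < len(tokens) / 2 else "late"
            counts[slot] += 1
    total = sum(counts.values())
    return counts["early"] / total if total else 0.0

def order_by_position(selected, pool=SENTENCE_POOL):
    """Order the selected nouns so that likely-early words come first."""
    refined = refine_pool(pool, selected)
    return sorted(selected, key=lambda w: early_probability(refined, w), reverse=True)

print(order_by_position(["apple", "boy"]))
print(order_by_position(["boy", "apple"]))
```

With this toy pool, both calls place "boy" before "apple", because the ordering comes from the
estimated probabilities and not from the order in which the words were selected. The ordered
nouns would then be passed to the grammar/production rules, which this sketch does not model.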
The proposed revised algorithm was implemented for testing and evaluation. The results
of our research showed that 68% of the generated sentences were grammatically and semantically
correct. Furthermore, a second evaluation was performed by generating sentences using both
the revised Purdom’s algorithm and our semantic-based algorithm.
The comparison showed that the semantic-based algorithm produced more accurate sentences and
is independent of the order of the selected words, whereas the revised Purdom’s algorithm is
highly dependent on the order of the words.