Ranking the sentences

Using the word stems from the last section, we will now score our sentences.

Our next step will be to go through the sentences we have, and give them a score (the higher the score, the more important the sentence) based on how many frequent words they contain. We will also "weight" the scores based on how early in the text the sentence appeared.

To do this, we will repeat much of what we did to compute the stems. That means duplicating some code, which is often regarded as bad practice, but here it helps us solve the problem simply. Once the code is working, we can go back and refactor it.

We will define a new array to store the sentences and their scores, much as we did with the stems object. After this, we will iterate through the sentences again, this time assigning each one a base score that decreases the later it occurs in the text, so that earlier sentences start with a higher score.

We can then use the same map and filter statements from before to clean up our words again, and finally use a reduce function to add up the total score for the sentence.

The array reduce function in JavaScript iterates through the array. On each pass, the callback receives the return value from the previous iteration (on the first iteration, the second parameter passed to reduce is used as the starting point) as well as the current item from the array. This lets us "reduce" an array of data to a single value: the return value from the final iteration.
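As a small illustration of the behaviour described above, here is a sketch that sums an array of numbers. The variable names are made up for this example and are not part of the summarizer code.

```javascript
// Sum an array of numbers, starting from an initial value of 0.
const numbers = [1, 2, 3, 4];
const total = numbers.reduce((acc, curr) => acc + curr, 0);

// The same reduction, but starting from 10 instead of 0.
// This mirrors how a starting score can be passed as the second parameter.
const totalFromTen = numbers.reduce((acc, curr) => acc + curr, 10);

console.log(total); // 10
console.log(totalFromTen); // 20
```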

In this case, we will "reduce" the array of words down to a single number: the combined score of all the words in the sentence, added on top of the base score.

Finally, we will push an object to our new array containing the sentence and its associated score.

const ranked = [];

for (let index = 0; index < sentences.length; index++) {
  const sentence = sentences[index];
  // earlier sentences start with a higher base score
  const baseScore = sentences.length - index;

  const score = sentence
    .split(' ')
    .map((word) => {
      const cleanedWord = word.match(/([a-zA-Z]+)/);
      return cleanedWord && cleanedWord[0]; // remove any punctuation
    })
    .filter((word) => {
      return word && !stopwords.includes(word); // make sure it isn't a stopword or null
    })
    .reduce((acc, curr) => {
      // words missing from stems add 0 instead of turning the score into NaN
      return acc + (stems[curr] || 0);
    }, baseScore);

  ranked.push({
    sentence,
    score,
  });
}

Now that we have scored each sentence, we will take the n highest-scoring sentences and return them as our summary.
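One way this selection step could look is sketched below. The helper name topSentences and the sample data are hypothetical, and the sketch assumes each entry has the shape { sentence, score } described above. Note that it returns sentences in score order rather than in their original order in the text.

```javascript
// Hypothetical helper: return the text of the n highest-scoring sentences.
// Assumes each entry looks like { sentence, score }.
function topSentences(rankedSentences, n) {
  return [...rankedSentences] // copy so the caller's array is not reordered
    .sort((a, b) => b.score - a.score) // highest score first
    .slice(0, n) // keep only the top n entries
    .map((entry) => entry.sentence); // drop the scores, keep the text
}

// Made-up sample data for illustration only.
const example = [
  { sentence: 'First.', score: 5 },
  { sentence: 'Second.', score: 9 },
  { sentence: 'Third.', score: 7 },
];

const summary = topSentences(example, 2);
console.log(summary); // [ 'Second.', 'Third.' ]
```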
