LeetCode 940: Distinct Subsequences II

Problem Description

Given a string s, return the number of distinct non-empty subsequences of s. Since the answer may be very large, return it modulo 10⁹ + 7.

A subsequence of a string is a new string that is formed from the original string by deleting some (can be none) of the characters without disturbing the relative positions of the remaining characters. (i.e., “ace” is a subsequence of “abcde” while “aec” is not.

Example 1:

Input: s = “abc” Output: 7 Explanation: The 7 distinct subsequences are “a”, “b”, “c”, “ab”, “ac”, “bc”, and “abc”.

Example 2:

Input: s = “aba” Output: 6 Explanation: The 6 distinct subsequences are “a”, “b”, “ab”, “aa”, “ba”, and “aba”.

Example 3:

Input: s = “aaa” Output: 3 Explanation: The 3 distinct subsequences are “a”, “aa” and “aaa”.

Constraints:

1 <= s.length <= 2000
s consists of lowercase English letters.

Difficulty: Hard

Tags: string, dynamic programming

Rating: 97.87%

Solution Approach

Intuition

For each position i in the string, we want to calculate the number of distinct subsequences that end at position i.
When we encounter a new character, we can:
- Either use it to form new subsequences by appending it to all existing subsequences
- Or start a new subsequence with just this character
However, we need to handle duplicates carefully. If we’ve seen a character before, we need to subtract some combinations to avoid counting duplicates.

Dynamic Programming Solution

Here’s my Python solution with detailed explanations:

class Solution:
    def distinctSubseqII(self, s: str) -> int:
        MOD = 10**9 + 7
        dp = [1]  # Initialize with 1 (empty subsequence)
        last_occ = {}  # Track last occurrence of each character
        
        for i, c in enumerate(s):
            current = dp[-1]  # Get count from previous position
            # Double the count (append current char to all existing subsequences)
            new_count = (current * 2) % MOD
            
            # If we've seen this character before
            if c in last_occ:
                j = last_occ[c]
                # Subtract the count of subsequences that would create duplicates
                new_count = (new_count - dp[j]) % MOD
            
            dp.append(new_count)
            last_occ[c] = i  # Update last occurrence
            
        return (dp[-1] - 1) % MOD  # Subtract empty subsequence

Base Case: We start with dp[0] = 1, representing the empty subsequence.
State Transition: For each character c at position i:
- We multiply the previous count by 2 because we can append c to all existing subsequences
- This gives us all old subsequences plus all new ones with c appended
Handling Duplicates: When we encounter a character we’ve seen before:
- We need to subtract the count of subsequences that would create duplicates
- We can find this by looking at the dp value at the last occurrence of the character
Final Result: We subtract 1 from the final answer to exclude the empty subsequence

Example Walkthrough

Let’s trace through the example s = “aba”:

Start with empty subsequence: dp = [1]
Process ‘a’:
- Double previous (1 * 2 = 2)
- Haven’t seen ‘a’ before
- dp = [1, 2] (“a”)
Process ‘b’:
- Double previous (2 * 2 = 4)
- Haven’t seen ‘b’ before
- dp = [1, 2, 4] (“a”, “b”, “ab”)
Process ‘a’:
- Double previous (4 * 2 = 8)
- Seen ‘a’ before at position 0
- Subtract dp[0] = 1
- dp = [1, 2, 4, 7] (“a”, “b”, “ab”, “aa”, “ba”, “aba”)
Final result: 7 - 1 = 6 distinct non-empty subsequences

Complexity Analysis

Time Complexity: O(n), where n is the length of the string
Space Complexity: O(n) to store the dp array and last occurrences

The solution is efficient because we only need to iterate through the string once and perform constant-time operations at each step.