Problem Description
Given a string s, return the number of distinct non-empty subsequences of s
. Since the answer may be very large, return it modulo 109 + 7
.
“ace”
is a subsequence of “abcde”
while “aec”
is not.
Example 1:
Input: s = “abc” Output: 7 Explanation: The 7 distinct subsequences are “a”, “b”, “c”, “ab”, “ac”, “bc”, and “abc”.
Example 2:
Input: s = “aba” Output: 6 Explanation: The 6 distinct subsequences are “a”, “b”, “ab”, “aa”, “ba”, and “aba”.
Example 3:
Input: s = “aaa” Output: 3 Explanation: The 3 distinct subsequences are “a”, “aa” and “aaa”.
Constraints:
1 <= s.length <= 2000
s
consists of lowercase English letters.
Difficulty: Hard
Tags: string, dynamic programming
Rating: 97.87%
Solution Approach
Intuition
-
For each position i in the string, we want to calculate the number of distinct subsequences that end at position i.
-
When we encounter a new character, we can:
- Either use it to form new subsequences by appending it to all existing subsequences
- Or start a new subsequence with just this character
-
However, we need to handle duplicates carefully. If we’ve seen a character before, we need to subtract some combinations to avoid counting duplicates.
Dynamic Programming Solution
Here’s my Python solution with detailed explanations:
class Solution:
def distinctSubseqII(self, s: str) -> int:
MOD = 10**9 + 7
dp = [1] # Initialize with 1 (empty subsequence)
last_occ = {} # Track last occurrence of each character
for i, c in enumerate(s):
current = dp[-1] # Get count from previous position
# Double the count (append current char to all existing subsequences)
new_count = (current * 2) % MOD
# If we've seen this character before
if c in last_occ:
j = last_occ[c]
# Subtract the count of subsequences that would create duplicates
new_count = (new_count - dp[j]) % MOD
dp.append(new_count)
last_occ[c] = i # Update last occurrence
return (dp[-1] - 1) % MOD # Subtract empty subsequence
-
Base Case: We start with dp[0] = 1, representing the empty subsequence.
-
State Transition: For each character c at position i:
- We multiply the previous count by 2 because we can append c to all existing subsequences
- This gives us all old subsequences plus all new ones with c appended
-
Handling Duplicates: When we encounter a character we’ve seen before:
- We need to subtract the count of subsequences that would create duplicates
- We can find this by looking at the dp value at the last occurrence of the character
-
Final Result: We subtract 1 from the final answer to exclude the empty subsequence
Example Walkthrough
Let’s trace through the example s = “aba”:
-
Start with empty subsequence: dp = [1]
-
Process ‘a’:
- Double previous (1 * 2 = 2)
- Haven’t seen ‘a’ before
- dp = [1, 2] (“a”)
-
Process ‘b’:
- Double previous (2 * 2 = 4)
- Haven’t seen ‘b’ before
- dp = [1, 2, 4] (“a”, “b”, “ab”)
-
Process ‘a’:
- Double previous (4 * 2 = 8)
- Seen ‘a’ before at position 0
- Subtract dp[0] = 1
- dp = [1, 2, 4, 7] (“a”, “b”, “ab”, “aa”, “ba”, “aba”)
-
Final result: 7 - 1 = 6 distinct non-empty subsequences
Complexity Analysis
- Time Complexity: O(n), where n is the length of the string
- Space Complexity: O(n) to store the dp array and last occurrences
The solution is efficient because we only need to iterate through the string once and perform constant-time operations at each step.