Balanced bracket sequences¶
A balanced bracket sequence is a string consisting of only brackets, such that this sequence, when inserted certain numbers and mathematical operations, gives a valid mathematical expression. Formally you can define balanced bracket sequence with:
- if
- if
For instance
Of course you can define other bracket sequences also with multiple bracket types in a similar fashion.
In this article we discuss some classic problems involving balanced bracket sequences (for simplicity we will only call them sequences): validation, number of sequences, finding the lexicographical next sequence, generating all sequences of a certain size, finding the index of sequence, and generating the
Balance validation¶
We want to check if a given string is balanced or not.
At first suppose there is only one type of bracket.
For this case there exists a very simple algorithm.
Let
If there are several bracket types involved, then the algorithm needs to be changed.
Instead of a counter
Number of balanced sequences¶
Formula¶
The number of balanced bracket sequences with only one bracket type can be calculated using the Catalan numbers.
The number of balanced bracket sequences of length
If we allow
Dynamic programming¶
On the other hand these numbers can be computed using dynamic programming.
Let
The initial value for this recurrence is
Finding the lexicographical next balanced sequence¶
Here we only consider the case with one valid bracket type.
Given a balanced sequence, we have to find the next (in lexicographical order) balanced sequence.
It should be obvious, that we have to find the rightmost opening bracket, which we can replace by a closing bracket without violation the condition, that there are more closing brackets than opening brackets up to this position. After replacing this position, we can fill the remaining part of the string with the lexicographically minimal one: i.e. first with as much opening brackets as possible, and then fill up the remaining positions with closing brackets. In other words we try to leave a long as possible prefix unchanged, and the suffix gets replaced by the lexicographically minimal one.
To find this position, we can iterate over the character from right to left, and maintain the balance
If we don't find a suitable position, then this sequence is already the maximal possible one, and there is no answer.
bool next_balanced_sequence(string & s) {
int n = s.size();
int depth = 0;
for (int i = n - 1; i >= 0; i--) {
if (s[i] == '(')
depth--;
else
depth++;
if (s[i] == '(' && depth > 0) {
depth--;
int open = (n - i - 1 - depth) / 2;
int close = n - i - 1 - open;
string next = s.substr(0, i) + ')' + string(open, '(') + string(close, ')');
s.swap(next);
return true;
}
}
return false;
}
This function computes in
Finding all balanced sequences¶
Sometimes it is required to find and output all balanced bracket sequences of a specific length
To generate then, we can start with the lexicographically smallest sequence
However, if the length of the sequence is not very long (e.g. next_permutation
, and check each one for balanceness.
Also they can be generated using the ideas we used for counting all sequences with dynamic programming. We will discuss the ideas in the next two sections.
Sequence index¶
Given a balanced bracket sequence with
Let's define an auxiliary array
For the start value
Now let us generate the index for a given sequence.
First let there be only one type of brackets.
We will use the counter
New let there be
Thus, when we look at the current character
This formula can be derived as follows:
First we "forget" that there are multiple bracket types, and just take the answer
Finding the -th sequence¶
Let
As in the previous section we compute the auxiliary array
First, we start with only one bracket type.
We will iterate over the characters in the string we want to generate.
As in the previous problem we store a counter
string kth_balanced(int n, int k) {
vector<vector<int>> d(2*n+1, vector<int>(n+1, 0));
d[0][0] = 1;
for (int i = 1; i <= 2*n; i++) {
d[i][0] = d[i-1][1];
for (int j = 1; j < n; j++)
d[i][j] = d[i-1][j-1] + d[i-1][j+1];
d[i][n] = d[i-1][n-1];
}
string ans;
int depth = 0;
for (int i = 0; i < 2*n; i++) {
if (depth + 1 <= n && d[2*n-i-1][depth+1] >= k) {
ans += '(';
depth++;
} else {
ans += ')';
if (depth + 1 <= n)
k -= d[2*n-i-1][depth+1];
depth--;
}
}
return ans;
}
Now let there be
Here is an implementation using two types of brackets: round and square:
string kth_balanced2(int n, int k) {
vector<vector<int>> d(2*n+1, vector<int>(n+1, 0));
d[0][0] = 1;
for (int i = 1; i <= 2*n; i++) {
d[i][0] = d[i-1][1];
for (int j = 1; j < n; j++)
d[i][j] = d[i-1][j-1] + d[i-1][j+1];
d[i][n] = d[i-1][n-1];
}
string ans;
int shift, depth = 0;
stack<char> st;
for (int i = 0; i < 2*n; i++) {
// '('
shift = ((2*n-i-1-depth-1) / 2);
if (shift >= 0 && depth + 1 <= n) {
int cnt = d[2*n-i-1][depth+1] << shift;
if (cnt >= k) {
ans += '(';
st.push('(');
depth++;
continue;
}
k -= cnt;
}
// ')'
shift = ((2*n-i-1-depth+1) / 2);
if (shift >= 0 && depth && st.top() == '(') {
int cnt = d[2*n-i-1][depth-1] << shift;
if (cnt >= k) {
ans += ')';
st.pop();
depth--;
continue;
}
k -= cnt;
}
// '['
shift = ((2*n-i-1-depth-1) / 2);
if (shift >= 0 && depth + 1 <= n) {
int cnt = d[2*n-i-1][depth+1] << shift;
if (cnt >= k) {
ans += '[';
st.push('[');
depth++;
continue;
}
k -= cnt;
}
// ']'
ans += ']';
st.pop();
depth--;
}
return ans;
}