-
Notifications
You must be signed in to change notification settings - Fork 150
Fix array compound literal parsing #309
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
| } | ||
| lex_expect(T_close_curly); | ||
| var->array_size = count; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a new blank line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a blank line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
?
DrXiao
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add some test cases to the test suite for validation.
jserv
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't use static qualifier since shecc does not support. Check 'COMPLIANCE.md' carefully.
|
You MUST ensure bootstrapping is fully functional before submitting pull requests. |
|
IIUC, compound literals are a feature supported only since C99, and the shecc README mentions from the very beginning that this project aims to support ANSI C. Therefore, IMO, this at least does not "fix" anything. |
As far as I know, shecc is planned to fully support the C99 standard, so I think the term "fix" is acceptable. |
|
Fine, but since we haven't claimed to fully support C99 and, IIUC, array compound literals were never supported before, this seems more like supporting a new feature to me, rather than fixing an existing problem. |
|
In fact, shecc has ability to handle array compound literals, but it only captures the first element (#299). Therefore, this pull request specifically aims to fix it. |
Thanks, that resolves my doubt. |
jserv
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't add tests/array_ptr.c. Instead, consolidate tests/driver.sh.
visitorckw
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please rebase your branch to keep the git history clean.
Commits that fix problems introduced within the same pull request should be avoided.
bf5f3d0 to
b88294c
Compare
|
Looking at the changes, the core logic for handling array compound literals looks solid—good job centralizing the decay behavior to avoid ad-hoc fixes everywhere. But yeah, there are some style inconsistencies and comment opportunities that could make this even tighter. Here's what I'd suggest, focused on Style Fixes for
|
0de5d30 to
08f1e29
Compare
|
I defer to @ChAoSUnItY for confirmation. |
| fatal("Unsupported truncation operation with invalid target size"); | ||
| } | ||
| return; | ||
| case OP_sign_ext: { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why removing the curly braces while this case branch has variable declaration?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn’t really mean to drop those braces—they were just left behind while trying other tweaks. but it’s fine either way because C99 lets us declare source_size right after the case label without changing behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, although C99 allows this. But since in our compiler's standard workflow, stage 0 will be compiled by GCC with -Wpedantic compilation option included, and this will cause building process to output warning even in this case it didn't do anything in a harmful way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reverted it to the original order in my latest commit
08f1e29 to
b13f5b6
Compare
dbea9ca to
4788e86
Compare
|
@jserv I have rebased to master, and I saw my branch is up to date with 'origin/master'. I am confused if the conflict is because of the rebase issue. |
|
I guess your origin points to your own repository on github instead of the upstream repository under sysprog21? |
yes! Thank you for pointing out! |
4788e86 to
cc86cd4
Compare
jserv
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use git rebase -i to rework commits, squashing intermediate ones.
cc86cd4 to
b619761
Compare
|
@lorettayao, note that the commit message body should be wrapped at 72 characters. |
Unify and correct the handling of array compound literals across parsing, semantic analysis, and lowering. The compiler now builds a temporary array by writing each element, tracking initializer counts, and returning the array’s address instead of collapsing the literal to its first element. Decay to a scalar is performed only when a scalar value is required. A centralized helper now governs literal decay, replacing scattered ad-hoc callers in binary operators, assignments, function-call arguments, and ternary expressions. This resolves cases where array literals were incorrectly forced to scalars in pointer contexts and brings the behavior closer to standard C expectations. The update also corrects scalarization in variadic and pointer arithmetic contexts, ensures pointer-typed ternary results, handles zero-length array literals as constant zero, avoids double-brace consumption in the parser, and regenerates ARM and RISC-V snapshots to match the corrected lowering. These changes restore correct pointer semantics for array compound literals and address sysprog21#299.
b619761 to
79847d5
Compare
src/parser.c
Outdated
| void parse_array_compound_literal(var_t *var, | ||
| block_t *parent, | ||
| basic_block_t **bb, | ||
| bool emit_code); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this forward declaration is unnecessary here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I found it unnecessary, will remove
| } | ||
| lex_expect(T_close_curly); | ||
| var->array_size = count; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a blank line.
79847d5 to
a4aba54
Compare
DrXiao
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The second commit (a4aba54) modified both the C source code and the test suites, but its message only describes the latter. The message does not accurately reflect the changes made.
DrXiao
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I notice that the term "decay" appears in the commit messages, but I am not sure why it is used. Could you explain the reason behind this choice?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Notice that a blank line is required between two function definitions. For example:
Correct example:
void func1(void)
{
/* ... */
}
/* (Comment)
* ...
* ...
*/
int func2(void)
{
/* ... */
}
void func3(int a)
{
/* ... */
}Incorrect example:
void func1(void)
{
/* ... */
}
/* (Comment)
* ...
* ...
*/
int func2(void)
{
/* ... */
}
void func3(int a)
{
/* ... */
}
i use the word decay to specify fixes around treating array compound literals this way—e.g., assigning (int[]){1,2,3} to an int * or passing it to a function should transparently yield a pointer |
Add regression tests that verify pointer and decay semantics of array compound literals. The tests cover decay during initialization, passing array literals to functions that expect pointer arguments, pointer-typed ternary expressions involving literal branches, and correctness for char[] and short[] literals in simple computations. These tests confirm the behavior introduced by the refined compound literal semantics implemented in the preceding commit. Parser cleanup: flattened the nested parameter-handling conditionals in parser.c so array-literal parameters are scalarized and promoted/resized without deep if-nesting, improving readability while preserving the decay semantics enforced by the new tests
a4aba54 to
3e1f41a
Compare
jserv
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The work touches too many places with similar boilerplate:
if (!var->ptr_level && var->array_size == 0)
rhs = scalarize_array_literal(parent, &bb, rhs, var->type);
This pattern appears in:
- read_body_assignment (line 3815)
- read_body_statement assignments (line 4563)
- read_body_statement nv assignments (line 4661)
- read_ternary_operation twice
Problem: The special case logic is multiplying, not disappearing.
| bool is_array_literal_placeholder(var_t *var) | ||
| { | ||
| return var && var->array_size > 0 && !var->ptr_level && | ||
| var->var_name[0] == '.'; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Detecting temporaries by checking if name starts with . is fragile. This is pattern-matching on internal naming conventions rather than explicit type tracking.
Better approach: Add an explicit is_compound_literal flag to var_t.
| if (func && param_num < func->num_params) { | ||
| var_t *target = &func->param_defs[param_num]; | ||
| if (!target->ptr_level && !target->array_size) | ||
| param = | ||
| scalarize_array_literal(parent, bb, param, target->type); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This only scalarizes for defined params, but variadic args (func->va_args) might still misbehave. Test case needed:
printf("%p", (int[]){1,2,3}); // Should print pointer, not 1
| var->array_size = count; | ||
| } | ||
| /* Identify compiler-emitted temporaries that hold array compound literals. | ||
| * Parsing assigns these temporaries synthetic names via gen_name_to (".tN") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add explicit compound literal tracking instead of .tN name inspection.
|
Consider test case below // Variadic function with array literal
int main(void) {
int *p = (int[]){42, 43, 44};
return *p == 42 ? 0 : 1;
}The existing tests cover basic pointer decay but not all edge cases from the cubic analysis. |
Summary
Fix a segfault when evaluating array compound literals like
(int[]){1,2,3,4,5}in expression context. The literal now yields the temporary array’s address (via decay) instead of collapsing to a single element, so indexing and reads work correctly.Motivation
Previously, code like the snippet below crashed because the array literal was reduced to a scalar and later treated as a pointer.
Reproduction (manual)
Before: segfault
After: prints
a[0..4]and Sum = 6 as expectedApproach (high level)
(type[]){…}produces a real temporary array object and the expression value decays to its address.Scope
Tests
Compatibility
Issue
Summary by cubic
Fix parsing of array compound literals so they allocate a temporary array and decay to its address, preserving pointer semantics and preventing segfaults.
Bug Fixes
Refactors
Written for commit 3e1f41a. Summary will update automatically on new commits.