Skip to content

Conversation

@cosmo0920
Copy link
Contributor

@cosmo0920 cosmo0920 commented Oct 24, 2025

Corresponding of fluent/fluent-bit#10691.

Summary by CodeRabbit

  • Documentation
    • Updated S3 documentation link and clarified how compression and format options map to availability.
    • Expanded Parquet guidance: prerequisites, enablement steps, testing/usage examples, and typical upload behavior.
    • Clarified gzip content-encoding guidance with validation examples.
    • Minor formatting and structure cleanups to improve readability.

✏️ Tip: You can customize this high-level summary in your review settings.

@cosmo0920 cosmo0920 requested review from a team as code owners October 24, 2025 05:37
@cosmo0920 cosmo0920 force-pushed the cosmo0920-add-parquet-c-description-for-out_s3 branch 4 times, most recently from 6e4b47e to b659055 Compare October 24, 2025 05:46
@cosmo0920
Copy link
Contributor Author

cosmo0920 commented Oct 24, 2025

@esmerel Good morning.
Vale warns some of usages that include Parquet is not following sentence style of capitalization. But Parquet is a proper noun. How do we handle on such warnings?

@eschabell eschabell self-assigned this Oct 27, 2025
@eschabell eschabell requested a review from esmerel October 27, 2025 20:44
@eschabell eschabell added the waiting-on-review Waiting on a review from mainteners label Oct 27, 2025
@eschabell
Copy link
Collaborator

@esmerel review needed, see questions above.

esmerel added a commit that referenced this pull request Oct 27, 2025
@esmerel
Copy link
Contributor

esmerel commented Oct 27, 2025

@esmerel Good morning. Vale warns some of usages that include Parquet is not following sentence style of capitalization. But Parquet is a proper noun. How do we handle on such warnings?

We add it to the spelling exceptions file for Vale. I went ahead and did that in #2116 =)

And also the heading file, which I should have realized. #2117

Copy link
Contributor

@esmerel esmerel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one suggestion that looks accidental

@cosmo0920 cosmo0920 force-pushed the cosmo0920-add-parquet-c-description-for-out_s3 branch from 0b5a2a3 to 3b4b0d4 Compare October 28, 2025 03:59
TomlinfreeGit pushed a commit to TomlinfreeGit/fluent-bit-docs that referenced this pull request Oct 28, 2025
Signed-off-by: Lynette Miles <[email protected]>
Signed-off-by: Tom <[email protected]>
Copy link
Contributor

@patrick-stephens patrick-stephens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine other than adding the legacy format config as well.

@patrick-stephens patrick-stephens added waiting-for-user Waiting for user/contributors feedback or requested changes and removed waiting-on-review Waiting on a review from mainteners labels Nov 5, 2025
cosmo0920 and others added 2 commits November 21, 2025 19:10
- Apply suggestion from @esmerel

Co-authored-by: Lynette  Miles <[email protected]>
Signed-off-by: Hiroshi Hatake <[email protected]>
@cosmo0920 cosmo0920 force-pushed the cosmo0920-add-parquet-c-description-for-out_s3 branch from 3b4b0d4 to 0e3ab85 Compare November 21, 2025 10:17
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Nov 21, 2025

Walkthrough

Documentation updates for the S3 output: replaced a CloudWatch link with the official Amazon S3 guide, merged compression description into a “Compression/format” section clarifying build-time Arrow options, and added detailed Parquet prerequisites, build steps, examples, usage notes, plus a duplicated Parquet block and minor formatting tweak.

Changes

Cohort / File(s) Summary
S3 Documentation Updates
pipeline/outputs/s3.md
Replaced CloudWatch link with official Amazon S3 user guide; consolidated compression list into a “Compression/format” section clarifying which formats require Arrow at build time and gzip/Content-Encoding behavior; added explicit Parquet prerequisites, "Enable Parquet support" build/test instructions, example fluent-bit.yaml and fluent-bit.conf, usage notes (use_put_object vs multipart), and a duplicated Parquet enablement block plus a minor tab/blank formatting adjustment.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

  • Verify accuracy of Parquet build and dependency instructions (Arrow, GLib/Parquet GLib).
  • Confirm example configurations (fluent-bit.yaml, fluent-bit.conf) are correct and runnable.
  • Check the duplicated Parquet block—remove if unintentional.
  • Validate the updated S3 user guide URL and the gzip/Content-Encoding note.

Poem

🐇 I hopped through docs with nimble feet,
I fixed a link and made guide neat.
Parquet paths I tucked in tight,
Compress and test — the examples light.
A joyful nibble — docs take flight! 🥕✨

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: adding documentation for pure C Parquet support in the S3 output plugin, which is the primary focus of the documentation updates.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch cosmo0920-add-parquet-c-description-for-out_s3

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 848e027 and 84b8c9a.

📒 Files selected for processing (1)
  • pipeline/outputs/s3.md (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • pipeline/outputs/s3.md

Tip

📝 Customizable high-level summaries are now available in beta!

You can now customize how CodeRabbit generates the high-level summary in your pull requests — including its content, structure, tone, and formatting.

  • Provide your own instructions using the high_level_summary_instructions setting.
  • Format the summary however you like (bullet lists, tables, multi-section layouts, contributor stats, etc.).
  • Use high_level_summary_in_walkthrough to move the summary from the description to the walkthrough section.

Example instruction:

"Divide the high-level summary into five sections:

  1. 📝 Description — Summarize the main change in 50–60 words, explaining what was done.
  2. 📓 References — List relevant issues, discussions, documentation, or related PRs.
  3. 📦 Dependencies & Requirements — Mention any new/updated dependencies, environment variable changes, or configuration updates.
  4. 📊 Contributor Summary — Include a Markdown table showing contributions:
    | Contributor | Lines Added | Lines Removed | Files Changed |
  5. ✔️ Additional Notes — Add any extra reviewer context.
    Keep each section concise (under 200 words) and use bullet or numbered lists for clarity."

Note: This feature is currently in beta for Pro-tier users, and pricing will be announced later.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2dee9d8 and 0e3ab85.

📒 Files selected for processing (1)
  • pipeline/outputs/s3.md (4 hunks)
🔇 Additional comments (3)
pipeline/outputs/s3.md (3)

700-723: LGTM on build requirements documentation.

The Parquet build requirements section is clear and actionable, with a concrete Ubuntu/Debian example and proper references to external documentation for other distributions.


728-783: LGTM on testing configurations.

The example configurations for Parquet testing are well-structured, providing both YAML and conf formats with appropriate settings for testing (debug logging, dummy input with diverse data types, and use_put_object enabled).


700-783: Verify: Has the legacy format request been addressed?

A previous review comment from patrick-stephens requested: "We should add the legacy format as well." The current Parquet section doesn't explicitly mention legacy compression formats or other format options.

Please clarify:

  1. Does this PR need to document additional compression formats beyond gzip and parquet?
  2. If so, should they be added to this section or elsewhere in the documentation?

Signed-off-by: Hiroshi Hatake <[email protected]>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0e3ab85 and 848e027.

📒 Files selected for processing (1)
  • pipeline/outputs/s3.md (4 hunks)
🔇 Additional comments (4)
pipeline/outputs/s3.md (4)

9-9: ✓ Helpful documentation link update.

The change from the CloudWatch Logs S3 link to the official AWS S3 user guide is more appropriate and direct for users referencing S3 capabilities.


39-39: ✓ Comprehensive compression parameter documentation.

The expanded compression description clearly distinguishes between:

  • Always-available formats (gzip with Content-Encoding header behavior)
  • Build-time conditional formats (parquet requiring -DFLB_ARROW=On and Arrow GLib/Parquet GLib)
  • Typical usage patterns (Parquet with use_put_object On)

This is precise and helpful for users evaluating their build configuration.


700-722: ✓ Clear build requirements and instructions for Parquet support.

The new "Build requirements for Parquet" section provides:

  • Specific package installation steps for Ubuntu/Debian
  • CMake command with the required -DFLB_ARROW=On flag
  • Reference to upstream Apache Parquet installation docs for other distributions

This guidance aligns well with the parameter documentation and enables users to set up Parquet correctly.


724-781: ✓ Thorough testing section with dual-format examples.

The "Testing Parquet support" section includes:

  • Complete service and pipeline configuration in both YAML and text (.conf) formats
  • Realistic input (dummy input with mixed data types)
  • Key Parquet-specific settings (compression: parquet, use_put_object: On)
  • Comments noting where additional parameters can be added

Examples are clear and actionable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

waiting-for-user Waiting for user/contributors feedback or requested changes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants