📝 2 new Issues
✅ 3 closed
✨ 5 new PRs
🎉 2 merged
💬 Issue/PR activity
Issue discussion
Issue #2007 Add missing authentication for RPC calls
- @LiebingYu: Suggests splitting RPCs into External and Internal categories and writing tests modeled on FlussAuthorizationITCase; also notes that the RPC list may be outdated.
Issue #2900 [fluss-client] Support Complex Data Types on the Java API (NestedRow/ROW)
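- @XuQianJin-Stars: Thanks for the detailed review. Agrees to refactor the PojoArrayToFlussArray precompiled converter, handle generic type extraction for Map, and move the tests to FlussTypedClientITCase.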
Issue #2974 [test] Unstable test RebalanceManagerITCase.testRebalanceWithRemoteLog
- @github-actions[bot]: CI test failure: the testRebalanceWithRemoteLog case failed with AssertionFailedError.
Issue #2982 [test] Unstable test CommitRemoteLogManifestITCase.testDeleteOutOfSyncReplicaLogAfterCommit
- @github-actions[bot]: CI test failure: the testDeleteOutOfSyncReplicaLogAfterCommit case failed with AssertionError.
Issue #2992 [test] Unstable test RemoteLogScannerITCase.testScanFromRemoteAndProject
- @github-actions[bot]: CI test failure: the testScanFromRemoteAndProject case failed with AssertionFailedError.
Issue #2995 [client] Support log scanner scan to arrow record batch
- @luoyuxia: TODO: add slice record batch support for the fetch offset.
Issue #3041 [spark] support scala 2.13
- @beryllw: LGTM, thanks for the contribution! Has cc'd wuchong for review.
Issue #3042 [spark] Support batch union read for lake-enabled primary key tables
- @Yohahaha: CI failed because of the known issue #2992; asks @YannByron to help review this PR.
Issue #3045 [common] Deep copy BinaryString in createDeepFieldGetter to fix use-after-free
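- @luoyuxia: STRING does not seem to need copying; is concerned that copying will hurt performance.
- @fresh-borzoni: Suggests adopting the copyStrings approach from #3008, which handles Arrow and CHAR types more precisely.
- @YannByron: Thanks for the review. #3008 explains STRING copying in more detail; agrees to close this PR in favor of the better approach.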
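The trade-off in this thread (copy cost versus stale reads) can be illustrated with a minimal sketch. ViewString and DeepCopyDemo are hypothetical stand-ins written for this digest; they are not Fluss's BinaryString or createDeepFieldGetter, and the sketch only assumes that a string value may point into a buffer that is later reused.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical string value that points into a shared byte buffer. */
final class ViewString {
    private final byte[] backing;
    private final int offset;
    private final int length;

    ViewString(byte[] backing, int offset, int length) {
        this.backing = backing;
        this.offset = offset;
        this.length = length;
    }

    /** Deep copy: the result owns a private byte array. */
    ViewString copy() {
        byte[] own = Arrays.copyOfRange(backing, offset, offset + length);
        return new ViewString(own, 0, length);
    }

    @Override
    public String toString() {
        return new String(backing, offset, length, StandardCharsets.UTF_8);
    }
}

final class DeepCopyDemo {
    /** Returns {value read through the view, value read through the copy}. */
    static String[] run() {
        byte[] buffer = "hello".getBytes(StandardCharsets.UTF_8);
        ViewString view = new ViewString(buffer, 0, buffer.length);
        ViewString copied = view.copy();

        // Simulate the buffer being reused for the next batch.
        byte[] next = "world".getBytes(StandardCharsets.UTF_8);
        System.arraycopy(next, 0, buffer, 0, next.length);

        return new String[] {view.toString(), copied.toString()};
    }
}
```

The view observes the overwritten bytes while the copy keeps the original value, which is why a copy is only worth its cost for types whose backing memory can actually be reused or released.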
Issue #3048 [lake][iceberg] Iceberg does not support union read for primary key table
- @MehulBatra: Under the current architecture, enabling union read for primary key tables causes a heavy merge-sort join; the design is being optimized with deletion vectors.
Issue #3057 [spark] Fix column projection on log/upsert read path
- @fresh-borzoni: Asks @luoyuxia, @YannByron, and @Yohahaha to help review.
Issue #3058 chore: fix docs for RustFS AssumeRole
- @fresh-borzoni: Asks @luoyuxia and @leekeiabstraction to help review; has added the documentation link to #2989.
PR Review
PR #2732 [common] A custom rounding policy that reduces Arrow's chunk size from 16MB to 4MB, which is the same as the Netty arena.
- @wuchong: Suggests checking whether RecordAccumulator and KvManager also need their default RootAllocator updated, and refactoring the logic into a shared static method.
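As an illustration of what a chunk-size-bounded rounding policy can look like, here is a minimal sketch. It assumes the policy rounds requests smaller than the chunk size up to the next power of two and leaves larger requests unchanged. The class name ChunkRounding and the method round are hypothetical and are not taken from PR #2732 or from Arrow.

```java
/** Hypothetical rounding policy with a 4 MB chunk size (illustration only). */
final class ChunkRounding {
    static final long CHUNK_SIZE = 4L * 1024 * 1024;

    /** Rounds small requests up to a power of two; large requests pass through. */
    static long round(long requestSize) {
        if (requestSize <= 0) {
            throw new IllegalArgumentException("requestSize must be positive");
        }
        if (requestSize >= CHUNK_SIZE) {
            return requestSize;
        }
        long highest = Long.highestOneBit(requestSize);
        // Already a power of two: keep it. Otherwise move to the next one.
        return highest == requestSize ? requestSize : highest << 1;
    }
}
```

Under these assumptions a 3 MB request is rounded to 4 MB, while a 5 MB request is left as is.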
PR #2740 [spark] support map type
- @beryllw: Suggests checking whether the call to InternalRowUtils.copyArray is needed; the copy may be avoidable.
PR #2802 [spark] refine the format when desc table (#2675)
- @Akash073-hub: Thanks for the feedback; will add test cases to verify the fix and update the PR shortly.
PR #2803 [client] Netty prefer heap memory
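- @wuchong: The config option NETTY_CLIENT_ALLOCATOR_HEAP_BUFFER_FIRST is unused; suggests reading the prefer-heap option from Configuration instead of hardcoding it.
- @Copilot: Several issues: setCumulator is not public and needs a wrapping subclass; the API signature change should keep the old method for compatibility; unit tests for HeapPreferringCumulator are missing; deleting DefaultCompletedFetchBufferLifecycleTest reduces coverage.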
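The first comment (read the option instead of hardcoding it) can be sketched as follows. This is a simplified illustration that uses a plain Map in place of Fluss's Configuration class; the key string and the class name PreferHeapOption are hypothetical and do not come from the PR.

```java
import java.util.Map;

/** Hypothetical helper that reads a prefer-heap flag from configuration. */
final class PreferHeapOption {
    // Hypothetical key, chosen for illustration only.
    static final String KEY = "client.netty.prefer-heap";

    /** Returns the configured value, defaulting to false when the key is absent. */
    static boolean preferHeap(Map<String, String> conf) {
        return Boolean.parseBoolean(conf.getOrDefault(KEY, "false"));
    }
}
```

The caller would then pass the returned flag to whatever builds the allocator, so that changing the configuration changes the behavior without a code edit.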
PR #3026 [client] Replace Netty PooledByteBufAllocator with bump-pointer ChunkedAllocationManager for Arrow memory allocation.
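- @Copilot: Several issues: ChunkedFactory.close() is never called, which leaks memory; unit tests for ChunkedAllocationManager are needed; the LogRecordReadContext in parseLimitScanResponse is not closed; suggests moving ChunkedAllocationManager to the arrow.memory package.
- @wuchong: Suggests moving the class to the org.apache.fluss.row.arrow.memory package, adding unit tests, and fixing a possible memory leak from UNSAFE.allocateMemory(0).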
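For readers unfamiliar with the technique, a bump-pointer allocator hands out consecutive slices of one chunk by advancing an offset. The sketch below is a heap-based toy written for this digest: BumpChunk is a hypothetical class, not the PR's ChunkedAllocationManager, and serving zero-length requests without touching the chunk is an assumption of this sketch rather than the fix discussed in the review.

```java
import java.nio.ByteBuffer;

/** Hypothetical bump-pointer allocator over a single heap chunk. */
final class BumpChunk implements AutoCloseable {
    private ByteBuffer chunk;
    private int next; // offset of the first free byte

    BumpChunk(int chunkSize) {
        this.chunk = ByteBuffer.allocate(chunkSize);
    }

    /** Returns a slice of the requested size, or null when the chunk is full. */
    ByteBuffer allocate(int size) {
        if (chunk == null) {
            throw new IllegalStateException("allocator is closed");
        }
        if (size < 0) {
            throw new IllegalArgumentException("size must not be negative");
        }
        if (size == 0) {
            // Zero-length requests never consume chunk space in this sketch.
            return ByteBuffer.allocate(0);
        }
        if (size > chunk.capacity() - next) {
            return null;
        }
        ByteBuffer window = chunk.duplicate();
        window.position(next);
        window.limit(next + size);
        next += size; // bump the pointer
        return window.slice();
    }

    int used() {
        return next;
    }

    boolean isClosed() {
        return chunk == null;
    }

    /** Releases the chunk; with native memory this is where it would be freed. */
    @Override
    public void close() {
        chunk = null;
        next = 0;
    }
}
```

Because the class implements AutoCloseable, wrapping it in a try-with-resources block guarantees that close() runs, which is the kind of leak the review comments point at.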
PR #3032 [flink] Add Flink filter pushdown integration and documentation
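- @wuchong: Several review comments: suggests changing the FlinkTableSource constructor parameter to TableConfig; statistics.columns needs to be revalidated in ALTER TABLE; suggests moving the validation logic into validateTableDescriptor; the test cases need a fix for duplicated data.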
PR #3047 [server] Manage users for sasl/plain authentication via cluster properties.
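- @Copilot: Several issues: changing the public static method signature is a breaking API change; the JAAS configuration does not validate or escape usernames and passwords; the SUBTRACT operation mishandles items containing whitespace; the forced NettyServer cast may throw ClassCastException; security.sasl.users stores plaintext passwords and should be marked as a sensitive config.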
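The whitespace problem with SUBTRACT can be illustrated with a minimal sketch that normalizes items before comparing them. It assumes a comma-separated list value; ListConfigOps and its methods are hypothetical names for this digest and are not the PR's code.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

/** Hypothetical helpers for a comma-separated list config value. */
final class ListConfigOps {
    /** Splits on commas, trims each item, and drops empty items. */
    static List<String> parse(String csv) {
        return Arrays.stream(csv.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toList());
    }

    /** Removes every item of toRemove from current, comparing trimmed values. */
    static String subtract(String current, String toRemove) {
        Set<String> remove = new LinkedHashSet<>(parse(toRemove));
        return parse(current).stream()
                .filter(s -> !remove.contains(s))
                .collect(Collectors.joining(","));
    }
}
```

Without the trim step, " bob" and "bob " would be treated as different items and the subtraction would silently leave the entry in place.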
PR #3057 [spark] Fix column projection on log/upsert read path
- @Yohahaha: Suggests moving the common methods into the FlussPartitionReader class to improve code reuse.
PR #3058 chore: fix docs for RustFS AssumeRole
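- @Copilot: The doc uses s3.path.style.access while other docs use s3.path-style-access; suggests a consistent spelling. The example ARN arn:aws:iam::0:role/... is not a valid AWS ARN format; a placeholder or a real 12-digit account ID should be used.
- @fresh-borzoni: Has addressed Copilot's review comments.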
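The ARN comment can be made concrete with a small check. This is a sketch under the assumption that an IAM role ARN needs a 12-digit account ID and a non-empty role name; RoleArnCheck is a hypothetical helper, and an S3-compatible store such as RustFS may accept other forms.

```java
import java.util.regex.Pattern;

/** Hypothetical validator for AWS-style IAM role ARNs. */
final class RoleArnCheck {
    // Partition, empty region field, 12-digit account ID, then role path/name.
    private static final Pattern ROLE_ARN = Pattern.compile(
            "^arn:(aws|aws-cn|aws-us-gov):iam::\\d{12}:role/[\\w+=,.@/-]+$");

    static boolean isValid(String arn) {
        return arn != null && ROLE_ARN.matcher(arn).matches();
    }
}
```

With this pattern, an account ID of 0 is rejected and a 12-digit ID such as 123456789012 is accepted.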
📝 New Issues/PRs
Issues
- #3055 [flink] Fix NPE when flink reading in batch mode and "scan.startup.mode" is not "FULL". @gyang94
- #3052 [client] LogFetchCollector uses immutable fetchOffset for staleness check, causing cross-poll CompletedFetch to be drained @platinumhamburg
Pull Requests
- #3058 chore: fix docs for RustFS AssumeRole @fresh-borzoni
- #3057 [spark] Fix column projection on log/upsert read path @fresh-borzoni
- #3056 [fix] fix: add lakeSource check in batch mode @gyang94
- #3054 [common] Optimize getNullCounts() to return int[] instead of Long[] @platinumhamburg
- #3053 Fix stale fetch detection across poll boundaries @platinumhamburg
✅ Closed Issues/PRs
Closed Issues
- #2726 Clear VectorSchemaRoot to release buffer as soon as possible after a batch read finished. @loserwang1024
- #2258 Support tier map type for iceberg @luoyuxia
- #1707 Iceberg support union read for primary key table in batch mode @beryllw
Merged PRs
- #2900 [fluss-client] Support Complex Data Types on the Java API (NestedRow/ROW) @XuQianJin-Stars
- #2728 [client] Clear VectorSchemaRoot to release buffer as soon as possible after a batch read finished. @loserwang1024