(pinot) branch master updated: docs: enhance copilot-instructions.md with comprehensive guidelines (#16693)

xiangfu Thu, 28 Aug 2025 09:10:15 -0700

This is an automated email from the ASF dual-hosted git repository.

xiangfu pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/pinot.git



The following commit(s) were added to refs/heads/master by this push:
     new 71112fa2d9c docs: enhance copilot-instructions.md with comprehensive 
guidelines (#16693)
71112fa2d9c is described below

commit 71112fa2d9c0e8226191e211c05a2c0028bafec8
Author: Xiang Fu <[email protected]>
AuthorDate: Thu Aug 28 09:10:00 2025 -0700

    docs: enhance copilot-instructions.md with comprehensive guidelines (#16693)
    
    * Refine by gpt-5
    
    * docs: enhance copilot-instructions.md with comprehensive guidelines
    
    - Add testing best practices, performance considerations, and security 
guidelines
    - Include error handling, concurrency, configuration, and documentation 
standards
    - Add test coverage guidelines with specific targets and tools
    - Expand assistant guidance with coding patterns and best practices
    - Document code review process and repository conventions
    - Incorporate domain-specific conventions for timezones and imports
---
 .github/copilot-instructions.md | 135 +++++++++++++++++++++++++++++++++++++---
 1 file changed, 127 insertions(+), 8 deletions(-)

diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md
index 97bba55e73a..417db865d80 100644
--- a/.github/copilot-instructions.md
+++ b/.github/copilot-instructions.md
@@ -1,22 +1,141 @@
 # Project overview
-Originally developed at LinkedIn, Apache PinotTM is a real-time distributed 
OLAP datastore, 
+Originally developed at LinkedIn, Apache PinotTM is a real-time distributed 
OLAP datastore,
 purpose-built to provide ultra low-latency analytics at extremely high 
throughput.
 
-With its distributed architecture and columnar storage, Apache Pinot empowers 
businesses to gain valuable insights from 
+With its distributed architecture and columnar storage, Apache Pinot empowers 
businesses to gain valuable insights from
 real-time data, supporting data-driven decision-making and applications.
 
+This document guides AI coding assistants (Copilot, Cursor, etc.) contributing 
to this repository.
+
 ## Libraries and Frameworks
 
 - Most source code is written in Java 11.
 - Code in `pinot-clients` is written in Java 8.
 - Pinot UI is a React.js frontend. It is stored in 
`pinot-controller/src/main/resources/`.
-- The code is compiled with Maven and follows a multimodule structure. Use the 
`pom.xml` file on each module to 
+- The code is compiled with Maven and follows a multimodule structure. Use 
each module's `pom.xml` to
   understand dependencies and configurations.
 
 ## Coding Standards
 
-* Generate header files for all files that you support, including Java, XML, 
JSON and markdown
-* Javadoc comments can either be written using the `/** ... */` syntax or the 
`///` syntax, 
-  as specified in [JEP-467](https://openjdk.org/jeps/467)
-* Method and parameters are not null by default, unless specified otherwise 
with the 
-  `javax.annotation.Nullable` annotation
\ No newline at end of file
+- Include the Apache Software Foundation (ASF) license header in all supported 
file types (Java, XML, JSON, Markdown).
+- Javadoc comments can either be written using the `/** ... */` syntax or the 
`///` syntax,
+  as specified in [JEP-467](https://openjdk.org/jeps/467).
+- Methods and parameters are non-null by default unless annotated with 
`javax.annotation.Nullable`.
+- Prefer explicit imports for classes rather than fully qualified names used 
inline.
+- Use SLF4J for logging; do not use `System.out` or `System.err`.
+- Propagate exceptions with useful context; avoid blanket `catch (Exception)` 
and never swallow errors.
+- Keep diffs minimal: do not reformat unrelated code and preserve existing 
whitespace/indentation styles.
+- Favor early returns and guard clauses; avoid deep nesting beyond two to 
three levels.
+- Add concise comments that explain "why", not "how"; avoid TODOs—implement 
instead.
+
+## Repository and Workflow Conventions
+
+- Use SSH Git remotes (`[email protected]:...`) rather than HTTPS.
+- Keep changes small and focused; include or update tests when behavior 
changes.
+- Write descriptive commit messages and PR descriptions; link related issues 
when relevant.
+- Prefer absolute file paths in discussions and tool calls for clarity.
+- Do not commit generated files, large binaries, or secrets.
+
+## Build and Test
+
+- Build with Maven from the repository root.
+- Run unit tests locally before opening a PR; ensure the project compiles 
cleanly without introducing lint errors.
+- Respect the multi-module structure; build order is governed by Maven.
+
+## Domain-specific conventions
+
+- Timezones: When defining or updating `STANDARD_TIMEZONES`, use a 
three-letter timezone ID with location in the
+  format `PDT(America/Los_Angeles)`. Prefer specific timezone IDs over generic 
UTC offsets, ensure comprehensive
+  coverage of common timezones, and sort entries by UTC offset.
+- Imports: Favor importing classes at the top of files instead of using fully 
qualified class names inline.
+
+## Testing Best Practices
+
+- Write unit tests for new functionality and bug fixes; follow the existing 
test naming conventions.
+- Use Mockito for mocking dependencies in unit tests.
+- Include integration tests when modifying core components or data flows.
+- Add performance tests for optimization changes; measure impact with 
realistic data volumes.
+- Use `@Test(expected = Exception.class)` sparingly; prefer `@Test` with 
try-catch for specific assertions.
+- Keep test methods focused and descriptive; avoid overly complex test setups.
+
+## Test Coverage Guidelines
+
+- Aim for high branch coverage (≥80%) on new code and modified code paths.
+- Focus on testing business logic rather than trivial getters/setters or 
framework boilerplate.
+- Prioritize coverage of error handling paths, edge cases, and boundary 
conditions.
+- Use JaCoCo or similar tools for measuring coverage; include coverage reports 
in CI builds.
+- Don't artificially inflate coverage with trivial tests; quality matters more 
than quantity.
+- Cover both happy path and failure scenarios, including timeout and resource 
exhaustion cases.
+- For critical components (e.g., query execution, data ingestion), maintain 
≥90% coverage.
+- Document coverage expectations in PR descriptions when coverage goals aren't 
met.
+- Use mutation testing (e.g., PITest) for critical code to ensure test quality.
+- Exclude generated code, test utilities, and simple data transfer objects 
from coverage requirements.
+
+## Performance Considerations
+
+- Be mindful of memory usage in data processing pipelines; avoid loading large 
datasets into memory unnecessarily.
+- Use efficient data structures (e.g., primitive arrays over collections for 
numerical data).
+- Consider lazy evaluation for expensive operations that might not always be 
needed.
+- Profile performance-critical code paths and document assumptions about data 
scale.
+- Use appropriate collection types based on access patterns (e.g., ArrayList 
for iteration, HashMap for lookups).
+
+## Security Guidelines
+
+- Never log sensitive information like passwords, API keys, or personally 
identifiable data.
+- Validate all user inputs and external data sources; sanitize inputs to 
prevent injection attacks.
+- Use parameterized queries for database operations to prevent SQL injection.
+- Implement proper authentication and authorization checks for sensitive 
operations.
+- Follow the principle of least privilege when granting permissions.
+
+## Error Handling and Logging
+
+- Use appropriate log levels: ERROR for failures, WARN for concerning 
situations, INFO for important events, DEBUG for detailed troubleshooting.
+- Include relevant context in log messages (e.g., request IDs, user context, 
operation parameters).
+- Use structured logging with key-value pairs for better searchability and 
analysis.
+- Create custom exception types for domain-specific errors rather than generic 
RuntimeException.
+- Include stack traces only in DEBUG level; avoid in production logs to 
prevent information leakage.
+
+## Concurrency and Threading
+
+- Use thread-safe data structures (e.g., ConcurrentHashMap) in multi-threaded 
contexts.
+- Prefer immutable objects for shared state to avoid race conditions.
+- Use appropriate synchronization mechanisms; favor higher-level constructs 
over synchronized blocks.
+- Document thread safety guarantees in class-level Javadoc.
+- Be aware of Pinot's distributed nature; consider consistency models and 
eventual consistency scenarios.
+
+## Configuration Management
+
+- Externalize configuration values; avoid hardcoding environment-specific 
settings.
+- Use meaningful default values with clear documentation about their impact.
+- Validate configuration at startup and provide helpful error messages for 
invalid settings.
+- Document all configuration properties in code comments or separate 
configuration docs.
+- Use consistent naming conventions for configuration keys (e.g., 
pinot.server.port).
+
+## Documentation Standards
+
+- Document public APIs with comprehensive Javadoc including parameter 
descriptions, return values, and exceptions.
+- Include code examples in documentation where helpful for complex APIs.
+- Update documentation when changing behavior; keep README files current.
+- Use consistent terminology across documentation and code comments.
+- Document performance characteristics and limitations for important 
components.
+
+## Code Review Process
+
+- Address all review comments thoughtfully; if you disagree, explain your 
reasoning clearly.
+- Keep PRs focused on a single concern; split large changes into smaller, 
reviewable commits.
+- Include context in PR descriptions: what problem you're solving, how you 
solved it, and any trade-offs.
+- Test your changes thoroughly before requesting review; include test results 
when relevant.
+- Be responsive to feedback and iterate quickly on requested changes.
+
+## Assistant guidance
+
+- When editing, touch only what is necessary; keep changes minimal and 
localized.
+- Preserve the file's existing indentation and whitespace; do not reflow or 
reformat unrelated code.
+- When creating new files, include the ASF license header and all required 
imports/dependencies.
+- Avoid adding extremely long hashes or non-textual content; keep repository 
contents textual and reviewable.
+- If uncertain about where a change belongs, search for similar code in the 
codebase and follow established patterns.
+- Use absolute file paths in discussions and tool calls for clarity.
+- Prefer early returns and guard clauses to reduce nesting and improve 
readability.
+- When implementing complex logic, break it into smaller, well-named methods.
+- Consider edge cases and failure scenarios in your implementations.
+- Follow the existing code patterns and architectural decisions in the 
codebase.


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

(pinot) branch master updated: docs: enhance copilot-instructions.md with comprehensive guidelines (#16693)

Reply via email to