Contributing to YADT
How to contribute to YADT
About Language
- Issues can be in Chinese or English
- PRs are limited to English
- All documents are provided in English only
Did you find a bug?
- Ensure the bug was not already reported by searching on GitHub under Issues. Please pay special attention to:
  - Known compatibility issues with pdf2zh - see #20 for details
  - Reported edge cases and limitations from downstream applications - see #23 for discussion
- If you're unable to find an open issue addressing the problem, open a new one. Be sure to include a title and a clear description with as much relevant information as possible.
If you wish to request changes or new features
- Suggest your change in the Issues section.
If you wish to add more translators
- This project is not intended for direct end-user use, and the supported translators are mainly for debugging purposes. Unless it clearly helps with development and debugging, PRs for directly adding translators will not be accepted.
- You can directly use PDFMathTranslate to get support for more translators.
If you wish to contribute to YADT
Tip
If you have any questions about the source code or related matters, please contact the maintainer at aw@funstory.ai.
You can also raise questions in Issues.
You can contact the maintainers in the pdf2zh discussion group.
We welcome pull requests and will review your contributions.
- Fork this repository and clone it locally.
- Use doc/deploy.sh to set up the development environment.
- Create a new branch and make code changes on that branch:
  git checkout -b feature/<feature-name>
- Perform development and ensure the code meets the requirements.
- Commit your changes to your new branch.
- Push to your repository: git push origin feature/<feature-name>.
- Create a PR on GitHub and provide a detailed description.
- Ensure all automated checks pass.
Basic Requirements
Workflow
- Please fork the repository from the main branch and develop on your fork.
- When submitting a Pull Request (PR), please provide a detailed description of the changes.
- If the PR fails automated checks (showing checks failed and red cross marks), please review the corresponding details and update the PR until the automated checks pass.
Development and Testing
- Use the uv run yadt command for development and testing.
- When you need to print logs, use log.debug(). DO NOT USE print().
- Code formatting
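The logging guideline above can be illustrated with a minimal sketch. It assumes that `log` is a module-level logger created with the standard library `logging` module; the source does not show how `log` is defined, and `count_words` is a hypothetical helper used only for illustration.

```python
import logging

# Assumption: `log` is a standard-library, module-level logger.
log = logging.getLogger(__name__)


def count_words(text: str) -> int:
    """Count whitespace-separated words (hypothetical helper)."""
    words = text.split()
    # Use log.debug() instead of print() so output can be filtered by level.
    log.debug("Counted %d words", len(words))
    return len(words)


if __name__ == "__main__":
    # Debug messages are only emitted when the level is set to DEBUG.
    logging.basicConfig(level=logging.DEBUG)
    count_words("hello YADT world")
```

Unlike `print()`, the debug message stays silent unless the caller enables debug-level logging.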
Dependency Updates
- If new dependencies are introduced, please update the dependency list in pyproject.toml accordingly.
- It is recommended to use the uv add command for adding dependencies.
Documentation Updates
- If new command-line options are added, please update the command-line options list in README.md accordingly.
Commit Messages
- Use Conventional Commits, for example: feat(translator): add openai.
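To make the commit message format concrete, here is a rough sketch of a header check. It is not part of the project's tooling. The list of types is an assumption based on common usage (the Conventional Commits specification itself only defines `feat` and `fix`), and `is_conventional_header` is a hypothetical name.

```python
import re

# Assumption: a commonly used set of types, not a list mandated by this project.
COMMIT_TYPES = (
    "feat", "fix", "docs", "style", "refactor",
    "perf", "test", "build", "ci", "chore",
)

# Header shape: <type>(<optional scope>)<optional !>: <description>
HEADER_PATTERN = re.compile(
    r"^(?P<type>" + "|".join(COMMIT_TYPES) + r")"
    r"(\((?P<scope>[^()\s]+)\))?"
    r"(?P<breaking>!)?"
    r": (?P<description>\S.*)$"
)


def is_conventional_header(header: str) -> bool:
    """Return True if the first line of a commit message matches the pattern."""
    return HEADER_PATTERN.match(header) is not None
```

For instance, `is_conventional_header("feat(translator): add openai")` is accepted, while a header without a type prefix such as `"added openai"` is rejected.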
Coding Style
- Please ensure submitted code follows basic coding style guidelines.
- Use pep8-naming.
- Comments should be in English.
- Follow these specific Python coding style guidelines:
a. Naming Conventions:
- Class names should use CapWords (PascalCase): class TranslatorConfig
- Function and variable names should use snake_case: def process_text(), word_count = 0
- Constants should be UPPER_CASE: MAX_RETRY_COUNT = 3
- Private attributes should start with underscore: _internal_state
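The naming conventions above can be seen together in one short sketch. The identifiers are taken from the examples in the list; the bodies are placeholders written only for illustration and do not reflect the actual codebase.

```python
MAX_RETRY_COUNT = 3  # Constant: UPPER_CASE


class TranslatorConfig:  # Class: CapWords
    """Hypothetical configuration holder used only to illustrate naming."""

    def __init__(self, retry_count: int = MAX_RETRY_COUNT) -> None:
        self.retry_count = retry_count  # Public attribute: snake_case
        self._internal_state = "idle"  # Private attribute: leading underscore


def process_text(text: str) -> int:  # Function: snake_case
    """Return the number of whitespace-separated words in the text."""
    word_count = len(text.split())  # Variable: snake_case
    return word_count
```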
b. Code Layout:
- Use 4 spaces for indentation (no tabs)
- Maximum line length is 88 characters (compatible with black formatter)
- Add 2 blank lines before top-level classes and functions
- Add 1 blank line before class methods
- No trailing whitespace
c. Imports:
- Imports should be on separate lines:
  import os
  import sys
- Imports should be grouped in the following order:
  - Standard library imports
  - Related third party imports
  - Local application/library specific imports
- Use absolute imports over relative imports
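As a small illustration of the import rules, the sketch below shows the three groups in order. Only the standard library group is active so the snippet runs anywhere; the third-party and local lines are commented out, and `my_package.config` is a hypothetical module path.

```python
# Group 1: standard library imports, one per line.
import os
import sys

# Group 2: related third-party imports would follow, for example:
# import numpy

# Group 3: local imports come last and use absolute paths (hypothetical module):
# from my_package.config import TranslatorConfig


def describe_platform() -> str:
    """Return a short description of the current platform."""
    return f"{sys.platform} ({os.name})"
```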
d. String Formatting:
- Prefer f-strings for string formatting: f"Count: {count}"
- Use double quotes for docstrings
e. Type Hints:
- Use type hints for function arguments and return values
- Example: def translate_text(text: str) -> str:
f. Documentation:
- All public functions and classes must have docstrings
- Use Google style for docstrings
- Example:
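A minimal sketch of a Google-style docstring combined with type hints and an f-string might look like the following. The function name comes from the type hint example above; the `target_lang` parameter and the placeholder body are assumptions added for illustration and do not perform a real translation.

```python
def translate_text(text: str, target_lang: str = "zh") -> str:
    """Translate text into the target language.

    This is a placeholder that only tags the input; it does not call a
    real translator.

    Args:
        text: The source text to translate.
        target_lang: Language code of the desired output (hypothetical).

    Returns:
        The input text prefixed with the target language code.

    Raises:
        ValueError: If text is empty.
    """
    if not text:
        raise ValueError("text must not be empty")
    return f"[{target_lang}] {text}"
```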
The existing codebase does not yet fully comply with the guidelines above. Contributions that bring it into compliance are welcome.
How to modify the intermediate representation
The intermediate representation is described by il_version_1.rnc. Corresponding Python data classes are generated using xsdata. The files il_version_1.rng, il_version_1.xsd, and il_version_1.py are auto-generated and must not be manually modified.
Format RNC file
Generate RNG, XSD and Python classes
# Generate RNG from RNC
trang babeldoc/document_il/il_version_1.rnc babeldoc/document_il/il_version_1.rng
# Generate XSD from RNC
trang babeldoc/document_il/il_version_1.rnc babeldoc/document_il/il_version_1.xsd
# Generate Python classes from XSD
xsdata generate babeldoc/document_il/il_version_1.xsd --package babeldoc.document_il