[ https://issues.apache.org/jira/browse/LUCENE-10557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17556475#comment-17556475 ]
Tomoko Uchida edited comment on LUCENE-10557 at 6/21/22 3:11 PM:
-----------------------------------------------------------------

I browsed through several JSON dumps of Jira issues. These are some observations.
 - It'd be easy to extract various metadata of issues (reporter id, status, created timestamp, etc.)
 - It'd be easy to extract all linked issue ids and sub-task ids
 - It'd be easy to extract all attached file URLs
 -- Can't estimate how many hours it will take to download all of the files
 - It'd be easy to extract all comments in an issue
 -- -Perhaps pagination is needed for issues with many comments- Comments in an issue can be retrieved all at once.
 - We can apply parser/converter tools to convert the Jira markup to Markdown
 -- I think this can be error-prone
 - It'd be cumbersome to extract GitHub PR links. The links to PRs only appear in the GitHub bot's comments in the Work Log.

On the GitHub side, there are no difficulties in dealing with the APIs.
 - It'd be a bit tedious to work with milestones via the APIs. They can't be referred to by their text; an id-to-text mapping is needed
 - -It might need some trials and errors to properly place attached files in their right place- This is not possible (we can't programmatically migrate attachment files to GitHub).

As for the cross-link conversion and account mapping script:
 - To "embed" GitHub issue links / accounts in their right place (maybe next to the Jira issue keys / user names), we need to modify the original text. This can be tricky and is the riskiest part to me. Instead of modifying the original text, we could just add some footnotes for the issues/comments - but that could considerably damage readability.

Yes, it should be possible with a set of small scripts. Maybe one problem is that it'd be difficult to detect conversion errors/omissions, and we can't correct them ourselves if we notice migration errors later (it seems we are not allowed to have the GitHub token of the ASF repository).
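To make the "embed links next to the Jira issue keys" idea concrete, here is a minimal sketch in Python. It is only an illustration under assumptions, not the actual migration script: the function name `embed_issue_links`, the `issue_map` contents, and the `KEY (#N)` output style are all hypothetical. It also returns the keys it could not map, since detecting omissions is one of the concerns raised above.

```python
import re

# Matches bare Jira keys such as LUCENE-10557. The lookbehind skips keys that
# are part of a URL or a longer token (e.g. .../browse/LUCENE-10557), so
# existing links are left untouched. This pattern is an assumption for the
# sketch; real comments may need more cases.
JIRA_KEY = re.compile(r"(?<![\w/-])LUCENE-\d+\b")


def embed_issue_links(text, issue_map):
    """Append ' (#N)' after each Jira key found in issue_map.

    issue_map is a hypothetical dict from Jira key to GitHub issue number.
    Returns (converted_text, unmapped_keys) so omissions can be reviewed.
    """
    unmapped = []

    def replace(match):
        key = match.group(0)
        number = issue_map.get(key)
        if number is None:
            # Keep the original text and record the key for later review.
            unmapped.append(key)
            return key
        return f"{key} (#{number})"

    return JIRA_KEY.sub(replace, text), unmapped


if __name__ == "__main__":
    # Made-up mapping; the numbers do not refer to real GitHub issues.
    mapping = {"LUCENE-10557": 1}
    body = "See LUCENE-10557 and LUCENE-9999."
    converted, missing = embed_issue_links(body, mapping)
    print(converted)
    print("unmapped:", missing)
```

The same shape would work for account mapping (a dict from Jira user name to GitHub account), though user names are far less regular than issue keys, which is part of why modifying the original text is risky.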
> Migrate to GitHub issue from Jira
> ---------------------------------
>
> Key: LUCENE-10557
> URL: https://issues.apache.org/jira/browse/LUCENE-10557
> Project: Lucene - Core
> Issue Type: Sub-task
> Reporter: Tomoko Uchida
> Assignee: Tomoko Uchida
> Priority: Major
>
> A few (not the majority) Apache projects already use the GitHub issue instead
> of Jira. For example,
> Airflow: [https://github.com/apache/airflow/issues]
> BookKeeper: [https://github.com/apache/bookkeeper/issues]
> So I think it'd be technically possible that we move to GitHub issue. I have
> little knowledge of how to proceed with it, I'd like to discuss whether we
> should migrate to it, and if so, how to smoothly handle the migration.
> The major tasks would be:
> * (/) Get a consensus about the migration among committers
> * Choose issues that should be moved to GitHub
> ** Discussion thread
> [https://lists.apache.org/thread/1p3p90k5c0d4othd2ct7nj14bkrxkr12]
> ** -Conclusion for now: We don't migrate any issues. Only new issues should
> be opened on GitHub.-
> ** Write a prototype migration script - the decision could be made on that.
> Things to consider:
> *** version numbers - labels or milestones?
> *** add a comment/ prepend a link to the source Jira issue on github side,
> *** add a comment/ prepend a link on the jira side to the new issue on
> github side (for people who access jira from blogs, mailing list archives and
> other sources that will have stale links),
> *** convert cross-issue automatic links in comments/ descriptions (as
> suggested by Robert),
> *** strategy to deal with sub-issues (hierarchies),
> *** maybe prefix (or postfix) the issue title on github side with the
> original LUCENE-XYZ key so that it is easier to search for a particular issue
> there?
> *** how to deal with user IDs (author, reporter, commenters)? Do they have
> to be github users? Will information about people not registered on github be
> lost?
> *** create an extra mapping file of old-issue-new-issue URLs for any
> potential future uses.
> *** what to do with issue numbers in git/svn commits? These could be
> rewritten but it'd change the entire git history tree - I don't think this is
> practical, while doable.
> * Build the convention for issue label/milestone management
> ** Do some experiments on a sandbox repository
> [https://github.com/mocobeta/sandbox-lucene-10557]
> ** Make documentation for metadata (label/milestone) management
> * Enable Github issue on the lucene's repository
> ** Raise an issue on INFRA
> ** (Create an issue-only private repository for sensitive issues if it's
> needed and allowed)
> ** Set a mail hook to
> [issues@lucene.apache.org|mailto:issues@lucene.apache.org] (many thanks to
> the general mail group name)
> * Set a schedule for migration
> ** Give some time to committers to play around with issues/labels/milestones
> before the actual migration
> ** Make an announcement on the mail lists
> ** Show some text messages when opening a new Jira issue (in issue template?)

--
This message was sent by Atlassian Jira
(v8.20.7#820007)