Translating the Aruch HaShulchan with AI: Method and Lessons[1]
Joshua Broyde (PhD)
Joshua Broyde is an AI/ML Healthcare & Life Sciences specialist at Google Cloud, where he works from strategy to implementation with enterprise medtech, healthcare, and biopharma companies to design, build, and deploy secure, production-grade AI and Generative AI systems. He has a PhD in the field of Computational Biology.
Abstract
I present a complete AI-written English translation of the Aruch HaShulchan, Rabbi Yechiel Michel Epstein’s comprehensive nineteenth-century code of Jewish law, covering all four sections of the Shulchan Aruch: Orach Chaim, Yoreh De’ah, Even HaEzer, and Choshen Mishpat. The translation was produced using Google’s Gemini 3 Flash model at its MINIMAL thinking level. Translation quality is consistently high for a first-pass working text, although the model does sometimes make mistakes. The full translation is freely available online at aruch-hashulchan-transla-53666.web.app, and the translation data and code are available at github.com/JoshB29/Texts_AI_Translations.
Introduction
The Aruch HaShulchan, by Rabbi Yechiel Michel Epstein (1829-1908) of Novardok, is one of the most authoritative and comprehensive codes of Jewish law in the modern era. Written in rabbinic Hebrew, it has remained largely inaccessible to English readers outside of a handful of translated sections.[2]
AI-driven translation represents a qualitative shift in what is possible for rabbinic literature. This does not fully replace careful human translation, but it creates an opportunity to scale massively: human scholars can now function primarily as editors, reviewing and refining AI-generated drafts rather than translating from scratch. A multi-year scholarly translation project can now be approximated in hours for a few dozen dollars.
Corpus
The Aruch HaShulchan comprises four sections (chelakim) corresponding to the four parts of the Shulchan Aruch: Orach Chaim, Yoreh De’ah, Even HaEzer, and Choshen Mishpat. I obtained the Hebrew source text from Sefaria[3]. In total, the corpus comprises 1,507 simanim (chapters), 24,161 seifim (subsections), and around 3,400,000 words.
Translation Method
Model and Thinking Level
I used Google’s Gemini 3 Flash (gemini-3-flash-preview) to create the translation[4]. Gemini Flash supports four “thinking levels” — MINIMAL, LOW, MEDIUM, HIGH — which trade computation budget against reasoning depth. I chose to use the MINIMAL setting, which created a good translation at a very low cost.
Each seif was passed as an isolated API call[5] with a fixed system prompt containing[6]:
- A style guide specifying conventions for rabbinic titles (e.g., Rambam, Rema, Beit Yosef), standard formulaic phrases (e.g., עד כאן לשונו → “Thus far his words”), and transliteration conventions (siman, seif, seif katan).
- Few-shot examples illustrating the original text and style of translation.
- An instruction to preserve the paragraph and logical structure of the original.
In order to make the translations more consistent with each other, I applied a post-processing normalization to all translated seifim after translation was complete.
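The per-seif setup described above can be sketched roughly as follows. This is a minimal illustration, not the pipeline's actual code (which is in the linked repository): `FewShot`, `build_system_prompt`, `translate_all`, and the `call_model` callable are hypothetical names, and the prompt wording is an assumption.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class FewShot:
    """One worked example: a Hebrew passage and its English rendering."""
    hebrew: str
    english: str


def build_system_prompt(style_rules: list[str], examples: list[FewShot]) -> str:
    """Assemble one fixed system prompt: style guide, few-shot examples, structure rule."""
    lines = [
        "You are translating the Aruch HaShulchan from rabbinic Hebrew into English.",
        "",
        "Style guide:",
    ]
    lines += [f"- {rule}" for rule in style_rules]
    lines += ["", "Examples:"]
    for ex in examples:
        lines += [f"Hebrew: {ex.hebrew}", f"English: {ex.english}", ""]
    lines.append("Preserve the paragraph and logical structure of the original.")
    return "\n".join(lines)


def translate_all(
    seifim: dict[str, str],
    system_prompt: str,
    call_model: Callable[[str, str], str],
) -> dict[str, str]:
    """Translate each seif in isolation: one call per seif, no shared history.

    `call_model(system_prompt, hebrew)` is a placeholder for whatever API
    client is used; it is injected so the loop itself stays model-agnostic.
    """
    return {ref: call_model(system_prompt, hebrew) for ref, hebrew in seifim.items()}
```

Because the model call is passed in as a function, the same loop can be exercised with a stub before any money is spent on real API calls.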
Choosing the Model Thinking Level to Use
To understand which thinking level is best for rabbinic translation, I randomly selected 5 seifim from each chelek and had Gemini Flash translate them at each thinking level. Below are the results:

I found that the MINIMAL setting produced translations of roughly equivalent quality[7] to those of the higher thinking levels. The additional expense of higher thinking did not seem to yield significantly better translations. You can see from the graph that the cost for each seif at the HIGH thinking level is around 5 cents, about 33 times the cost at the MINIMAL thinking level[8]. This would represent a total cost of about $1300, as opposed to the roughly $40 I spent using the MINIMAL setting[9]. Most of this cost is due to the large number of thinking tokens used at the HIGH level. These results also suggest that Gemini’s larger model, Gemini Pro, would be overkill for the project. However, I did not formally test Gemini Pro or any other model family[10].
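The cost comparison can be reproduced as back-of-the-envelope arithmetic. The sketch below uses only figures stated in this article (24,161 seifim, $40.73 total at MINIMAL, a roughly 33x ratio for HIGH, about 3,400,000 words, and a $0.05 per word lower bound for human translation). The extrapolation assumes the 33x ratio measured on the small sample holds across the whole corpus, so the results are approximate.

```python
SEIFIM = 24_161
WORDS = 3_400_000
MINIMAL_TOTAL_USD = 40.73
HIGH_TO_MINIMAL_RATIO = 33        # measured on a small sample; assumed to hold corpus-wide
HUMAN_USD_PER_WORD_LOW = 0.05     # lower bound quoted for professional translation

# Average cost of one seif at MINIMAL, derived from the actual total spend.
minimal_per_seif = MINIMAL_TOTAL_USD / SEIFIM

# Projected HIGH costs, scaling the MINIMAL figures by the measured ratio.
high_per_seif = minimal_per_seif * HIGH_TO_MINIMAL_RATIO
high_total = high_per_seif * SEIFIM

# Lower-bound cost of a professional human translation of the same corpus.
human_total_low = WORDS * HUMAN_USD_PER_WORD_LOW

print(f"MINIMAL per seif: ${minimal_per_seif:.4f}")
print(f"HIGH per seif:    ${high_per_seif:.4f}")
print(f"HIGH total:       ${high_total:,.2f}")
print(f"Human (low end):  ${human_total_low:,.0f}")
```

The projected HIGH figures come out a little above 5 cents per seif and a little above $1,300 in total, in line with the rounded numbers quoted above.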
Post-Processing Results
One thing I noticed was inconsistency in how the word *seif* and the names of certain rabbinic works were rendered. Specifically, *seif* was sometimes kept in transliteration and sometimes translated as “section.” To address this, I applied a post-processing step using regular expressions to standardize these renderings: *seif* was normalized to “section” and *seif katan* to “subsection.” I applied the same approach to certain rabbinic names, normalizing them to a single canonical spelling throughout the corpus; for example, the *Shach* was sometimes transliterated as “Shakh.” For future translations, this should be fixed with better prompting.
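A normalization pass of this kind might look like the sketch below. The three rules are the ones named above; the rule table, the `normalize` function, and the capitalization handling are illustrative assumptions rather than the actual script in the repository.

```python
import re


def _match_case(replacement: str):
    """Return a re.sub callback that capitalizes the replacement if the match was capitalized."""
    def repl(match: re.Match) -> str:
        return replacement.capitalize() if match.group(0)[0].isupper() else replacement
    return repl


# Order matters: the two-word phrase must be rewritten before the bare word,
# otherwise "seif katan" would become "section katan".
RULES = [
    (re.compile(r"\bseif katan\b", re.IGNORECASE), _match_case("subsection")),
    (re.compile(r"\bseif\b", re.IGNORECASE), _match_case("section")),
    (re.compile(r"\bShakh\b"), "Shach"),
]


def normalize(text: str) -> str:
    """Apply every rule in order to one translated seif."""
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text


print(normalize("See Shakh, seif katan 5, and seif 4."))
```

The word boundaries keep the second rule from touching the "section" inside "subsection", and also mean plural forms such as "seifim" are left alone unless a rule is added for them.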
Despite the fact that each seif is translated separately, my overall impression is that the translations were reasonably good and consistent.
Translation Quality
To more systematically validate the translation quality, I selected 100 random seifim and had a separate model evaluate the translations[11]; specifically, I used OpenAI’s GPT-5.4 model with a low reasoning effort. Each of the 100 seifim was passed to the model together with its original Hebrew, and the model was asked to classify the translation into one of four categories: Good, StyleImprovementNeeded, Mistake, or EgregiousError.
The egregious error category was defined strictly: a translation qualifies as egregious if it fundamentally inverts or distorts the author’s meaning — even if the error itself is subtle. To illustrate this, the following example was included in the evaluation prompt:
Original Hebrew
OC 635:5:
ובוודאי אין זה רק זהירות בעלמא, דאיזה פסול יש בעשיית הגג קודם הדפנות? דאין זה דמיון לחוטט בגדיש, שאין על זה שם גג כלל. אבל הכא גג גמור הוא.
Hypothetical mistranslation:
And certainly this is not only a mere stringency, for what disqualification is there in making the roof before the walls? For this is not comparable to hollowing out a haystack, where it does not have the status of a roof at all. But here, it is a complete roof.
The Aruch HaShulchan is in fact saying the opposite – the Rema’s statement is only a mere stringency. The mistranslation — inserting the words “not only” — is superficially consistent with the Hebrew, but is inconsistent with the actual context of the text and completely inverts the intended meaning.[12]
I ran the analysis three times and took a majority vote to determine the classification of each seif. If there was a three-way tie, the most serious category was taken. Here are the results:
| Classification | Percentage |
| --- | --- |
| Good | 29% |
| StyleImprovementNeeded | 54% |
| Mistake | 17% |
| EgregiousError | 0% |
You can see that the GPT model thinks that the translations overall need to be improved in style, and found 17% of the translations had a mistake. Importantly though, there were no egregious errors found.
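The voting rule used for these classifications (majority of three runs, with a three-way tie resolved toward the most serious category) is simple enough to state as code. This is a minimal sketch; `SEVERITY` and `resolve` are hypothetical names, and the category order is assumed to run from least to most serious as listed in the table.

```python
from collections import Counter

# Categories ordered from least to most serious.
SEVERITY = ["Good", "StyleImprovementNeeded", "Mistake", "EgregiousError"]


def resolve(votes: list[str]) -> str:
    """Pick the majority label; among equally frequent labels, pick the most serious.

    Raises ValueError if a vote is not one of the known categories.
    """
    counts = Counter(votes)
    top = max(counts.values())
    tied = [label for label, n in counts.items() if n == top]
    return max(tied, key=SEVERITY.index)


print(resolve(["Good", "Good", "Mistake"]))                    # majority wins
print(resolve(["Good", "StyleImprovementNeeded", "Mistake"]))  # three-way tie
```

With three runs, a label that appears twice always wins outright, so the tie-break only comes into play when all three runs disagree.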
Most of the mistakes were mistranslations of specific words/phrases or incorrect abbreviation expansions; some of the mistakes are a bit humorous. Here are a few examples:
[עח”ד סק”ה שכתב דדווקא ברואה מ”ת אמרינן כן משום די”ל שמש עכרן אבל בשלא מחמש (sic) תשמיש …]
Translation: [see Chavot Daat subsection 5 who wrote that specifically regarding one who sees from the Torah…]
However, מ”ת means “due to sexual intercourse”. The model seems to have incorrectly expanded this to “מן התורה”, despite that making no sense in context.[13]
כל צרכי ציבור שאינם יכולים להשוות עצמן מפני שינוי דיעות יש להושיב כל הבע”ב הנותנים מס ויקבלו עליהם ברכה שכל אחד יאמר דעתו לש”ש וילכו אחר רוב דיעות
Translation: Regarding all communal needs where they are unable to reach an agreement due to a difference of opinions, they should seat all the householders who pay tax and they shall accept upon themselves a blessing that each one will state his opinion for the sake of Heaven, and they shall follow the majority of opinions.
GPT flagged this as a mistake: “blessing” is not the right translation in this context; “admonition” or a similar word would be better.
ושתיית יין שרוף מעט – וודאי מגרר גריר.
Translation: And drinking a small amount of burnt wine – certainly stimulates.
“Burnt wine” is a clear mistake for distilled alcohol.
Here are a few more examples that I found of clear mistakes or stylistic quirks:
- In OC siman 475, seif 16, the abbreviation “עק”נ” is incorrectly rendered as “see Ketzot HaChoshen” — the correct translation should be “see Korban Netanel”.
- In YD siman 19, seif 6, the word “poskekim” appears twice. This word does not exist in Hebrew or in standard transliteration; it appears to be a model-generated corruption of poskim (the standard plural of posek, a halachic decisor). The model makes up this word a few times across the translation.
- In EH, Siman 1 seif 2, האשה אינה מצווה על פריה ורביה is translated as “The woman is not commanded regarding pruz u-revu”. This should be “The woman is not commanded regarding pru-urevu” or even better would be “Women are not commanded in pru-urevu”.
- In CM 24:1, “רבותינו ז”ל אמרו [רפ”ה לב”ק] שאין נזקקין אלא לתובע תחלה” is translated as “Our sages, of blessed memory, said [at the beginning of the eighth chapter of Bava Kamma]”. This should be “at the beginning of the fifth chapter”.
My conclusion is that the translation avoided serious errors overall, but that it occasionally mistranslates more complex words or phrases and sometimes expands abbreviations incorrectly. The model is also no Shakespeare in terms of prose, and is sometimes stylistically awkward. Still, it is mostly correct and I think it is a useful resource. Not bad for $40.73.
Conclusion and Lessons for Future Translations
Based on my experience using AI to translate the Aruch HaShulchan, I offer the following recommendations for those undertaking similar large-scale AI translation projects in rabbinic or other specialized religious and legal literature.
Invest in the system prompt. The single highest-leverage action is writing a strong system prompt. The prompt should do two distinct things: (1) establish the content guidance — what the model is translating, from what tradition, at what level of expertise; and (2) establish stylistic guidance — how specific terms, titles, phrases, and citation styles should be rendered, consistently and unambiguously. The prompt I used included a style guide with explicit rules for rabbinic titles, standard formulaic phrases, and transliteration conventions.
Few-shot examples are highly effective. I found that including two or three worked examples in the prompt — a Hebrew passage followed by its correct English rendering — substantially improved consistency and register. The model learns from examples what it cannot fully internalize from rules alone, particularly for the tone and sentence structure expected in halachic prose. In the prompt I used, the few shots were not even entire seifim and were actually quasi-synthetic quotes; they were meant simply to show the style of source and target text.
Smaller, lightweight models are sufficient. There is no need to use the largest or most expensive models for this class of task. LLMs have essentially solved the problem of rabbinic Hebrew translation at the level of legal accuracy. Gemini 3 Flash at MINIMAL thinking, the smallest and cheapest configuration I tested, produced translations roughly as good as those produced at the HIGH thinking level, which costs about 33 times as much. The marginal quality improvement from more compute is real but small; the marginal cost increase is enormous.[14]
Prompt iteration is essential. It took a few tries to get the model to translate in the way I wanted, and running the model on a few seifim, looking at the translation, and adjusting the prompt was very helpful. This was especially helpful for certain expressions I wanted translated consistently.
Context improves consistency, but potentially at significant cost. The primary weakness of this pipeline is that each seif is translated in complete isolation: the model has no memory of the spelling choices, transliteration decisions, or citation delimiter styles it used in prior seifim. Including previous seifim as context — either by passing the already-translated text of the current siman, or through a more dynamic agent-based RAG system — would improve within-siman consistency meaningfully. However, because halachic terminology and cross-references can reach back not just to the previous seif but to earlier simanim and even different sections of the work, the amount of context needed to capture these dependencies could quickly become expensive. For a first-pass translation, I found the isolated-call approach worked remarkably well. For a definitive scholarly edition, context-aware translation is the clear next step.
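One simple form of context-aware translation is to pass the most recent already-translated seifim of the current siman along with each new seif, capped by a budget so cost stays bounded. The sketch below is an untested idea rather than something this project implemented: `build_context`, `build_user_message`, and the character budget (a crude stand-in for a token budget) are all assumptions.

```python
def build_context(prior_translations: list[str], max_chars: int) -> list[str]:
    """Keep the most recent prior seifim that fit within the budget, in original order."""
    kept: list[str] = []
    used = 0
    # Walk backwards so the nearest seifim are preferred over distant ones.
    for text in reversed(prior_translations):
        if used + len(text) > max_chars:
            break
        kept.append(text)
        used += len(text)
    kept.reverse()
    return kept


def build_user_message(hebrew: str, context: list[str]) -> str:
    """Prepend the selected context, if any, to the seif being translated."""
    if not context:
        return f"Translate this seif:\n{hebrew}"
    joined = "\n\n".join(context)
    return (
        "Earlier translated seifim in this siman, for consistency only:\n"
        f"{joined}\n\nTranslate this seif:\n{hebrew}"
    )


prior = ["First seif in English.", "Second seif in English.", "Third seif in English."]
print(build_user_message("...", build_context(prior, max_chars=50)))
```

A fixed budget like this captures within-siman consistency cheaply, but it does nothing for cross-references to earlier simanim or other sections, which is where a retrieval-based approach would be needed.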
Conclusion
Translating the complete Aruch HaShulchan, approximately 24,000 seifim across four major sections, is technically feasible, inexpensive ($40.73), and fast (taking a few days on my personal Mac) using current AI technology. The resulting translation is of high quality for a working text, albeit with clear weaknesses that can be addressed in future iterations. The full translation is freely available at aruch-hashulchan-transla-53666.web.app.
Appendix A: Example Gemini translations at different thinking levels
# Orach Chaim 475:16 — Thinking Level Comparison
## Hebrew Source
וכתב רבינו הבית יוסף בסעיף ד: אכל מצה בלא כוונה, כגון שאנסוהו אנסים או לסטים לאכול – יצא ידי חובתו, כיון שהוא יודע שהלילה פסח, ושהוא חייב באכילת מצה. אבל אם היה סבור שהוא חול, או שאין זה מצה – לא יצא. עד כאן לשונו. סברא זו כתב כדי לתרץ דברי הרמב”ם, שכאן פסק דיצא ובפרק שני משופר פסק דבלא מתכוין – לא יצא, לחלק בין אכילה לשאר דבר. ומיהו גם באכילה, אם לא ידע שפסח היום או שזהו מצה – וודאי לא יצא, דמצות צריכות כוונה. ובאכילה כשידע, אף שנעשה באונס – הוה ככוונה. וסברא זו הוזכרה בתוספות פסחים (קטו א). אבל מדברי הרא”ש שם, והטור בכאן, לא משמע כן, עיין שם. (עק”נ שם אות ק’, שהאריך בזה.)
—
## Minimal Thinking
Our teacher, the Beit Yosef, wrote in section 4: If one ate matzah without intent, for example, if oppressors or bandits forced him to eat—he has fulfilled his obligation, since he knows that the night is Pesach and that he is obligated in the eating of matzah. But if he thought it was a weekday, or that this is not matzah—he has not fulfilled it. Thus far his words. He wrote this reasoning in order to reconcile the words of the Rambam, who ruled here that he has fulfilled it, yet in the second chapter of Shofar ruled that without intent—one has not fulfilled it, to distinguish between eating and other matters. However, even regarding eating, if he did not know that today is Pesach or that this is matzah—he certainly has not fulfilled it, for mitzvot require intent. And regarding eating, when he knew, even though it was done under duress—it is like intent. This reasoning was mentioned in the Tosafot in Pesachim (115a). But from the words of the Rosh there, and the Tur here, it does not appear so, see there. (See Ketzot HaChoshen there, letter 100, who elaborated on this.)
—
## Low Thinking
And our teacher, the Beit Yosef, wrote in section 4: If one ate matzah without intent, such as if oppressors or bandits coerced him to eat—he has fulfilled his obligation, since he knows that tonight is Pesach, and that he is obligated in the eating of matzah. But if he thought it was a weekday, or that this is not matzah—he has not fulfilled [his obligation]. Thus far his words. He wrote this reasoning in order to reconcile the words of the Rambam, for here he ruled that he has fulfilled [his obligation], and in the second chapter of Shofar he ruled that without intent—he has not fulfilled [it], to distinguish between eating and other things. However, even regarding eating, if he did not know that today is Pesach or that this is matzah—certainly he has not fulfilled [it], for mitzvot require intent. And regarding eating, when he knew, even though it was done under duress—it is like intent. And this reasoning was mentioned in Tosafot Pesachim (115a). But from the words of the Rosh there, and the Tur here, it does not appear so; see there. (See Korban Netanel there, letter 100, who elaborated on this.)
—
## Medium Thinking
Our teacher, the Beit Yosef, wrote in section 4: If one ate matzah without kavanah, such as if coercers or bandits forced him to eat—he has fulfilled his obligation, since he knows that tonight is Pesach and that he is obligated in the eating of matzah. But if he thought it was chol, or that this was not matzah—he has not fulfilled his obligation. Thus far his words. He wrote this reasoning in order to reconcile the words of the Rambam, for here he ruled that he has fulfilled his obligation, and in the second chapter of Shofar he ruled that without kavanah—he has not fulfilled his obligation, to distinguish between eating and other things. However, even regarding eating, if he did not know that today is Pesach or that this is matzah—certainly he has not fulfilled his obligation, for mitzvot require kavanah. And regarding eating, when he knew, even though it was done under coercion—it is like kavanah. And this reasoning was mentioned in Tosafot Pesachim (115a). But from the words of the Rosh, ibid., and the Tur here, it does not appear so; see there. (See Korban Netanel, ibid., letter 100, who elaborated on this.)
—
## High Thinking
And our teacher, the Beit Yosef, wrote in seif 4: If one ate matzah without intent, for example, if oppressors or bandits forced him to eat—he has fulfilled his obligation, since he knows that tonight is Pesach and that he is obligated in the eating of matzah. But if he thought it was a weekday, or that this is not matzah—he has not fulfilled. Thus far his words. He wrote this reasoning in order to reconcile the words of the Rambam, for here he ruled that he has fulfilled, and in the second chapter of Shofar he ruled that without intent—he has not fulfilled, to distinguish between eating and other matters. However, even regarding eating, if he did not know that today is Pesach or that this is matzah—certainly he has not fulfilled, for mitzvot require intent. And regarding eating, when he knows, even though it was done under duress—it is like intent. And this reasoning was mentioned in Tosafot Pesachim (115a). But from the words of the Rosh ibid., and the Tur here, it does not appear so; see there. (See Korban Netanel ibid., letter 100, who elaborated on this.)
These translations are substantially the same, but have a few differences:
The MINIMAL and MEDIUM translations omit the initial “and”, which is technically an error but does not impact the meaning. The most glaring error is that the MINIMAL translation expands עק”נ as “Ketzot HaChoshen”, whereas the higher thinking levels correctly expand it to “Korban Netanel”.
Appendix B: Some Sample Translations
# Aruch HaShulchan — Translation Samples
## OC 67:2 — Doubt about Keriat Shema
**Hebrew**
ופשוט הוא דמיירי שיש כאן ספק על הכל: על הקריאת שמע ועל הברכות. אבל אם הספק הוא רק על קריאת שמע, והברכות וודאי קרא – פשיטא שאומר הקריאת שמע ולא הברכות, דהברכות אינם שייכים לקריאת שמע כמו שכתבתי בסימן נ”ט. אך דיש לומר דאם פשיטא ליה דאמר “אמת ויציב” – גם קריאת שמע אינו חוזר וקורא, דוודאי קרא גם קריאת שמע דסירכיה נקיט ואתי. כמו שכתבתי בסימן ס”ד, דכשפתח “למען ירבו” אמרינן דוודאי אמר “והיה אם שמע” דסרכיה נקיט, עיין שם. והכא נמי כן הוא. ואם ברי לו שברכות שלפני קריאת שמע אמר, והספק הוא מקריאת שמע והברכות שלאחריהן – קורא קריאת שמע והברכות שלאחריהן.
**Translation**
And it is simple that this refers to a case where there is a doubt regarding everything: regarding the Keriat Shema and regarding the blessings. But if the doubt is only regarding the Keriat Shema, and the blessings he certainly recited—it is simple that he recites the Keriat Shema and not the blessings, for the blessings are not connected to the Keriat Shema, as I wrote in siman 59. However, one could say that if it is certain to him that he recited “Emet VeYatziv”—he also does not return and recite Keriat Shema, for he certainly also recited Keriat Shema, as he follows his habit. As I wrote in siman 64, that when he began “Lema’an Yirbu” we say that he certainly recited “VeHayah Im Shamoa” because he follows his habit, see there. And here too it is so. And if it is clear to him that he recited the blessings before Keriat Shema, and the doubt is regarding the Keriat Shema and the blessings after them—he recites Keriat Shema and the blessings after them.
## OC 292:1 — Mincha on Shabbat: Ashrei and Va’ani Tefilati
**Hebrew**
אומרים ‘אשרי’ ‘ובא לציון’, דבכל יום אומרים אותו בשחרית, ובשבת שבלאו הכי יש אריכות בשחרית – הניחוהו למנחה. ואומרים חצי קדיש. ואחר כך אומרים פסוק ‘ואני תפלתי’, על פי המדרש שמקודם כתיב: ‘ישיחו בי יושבי שער ונגינות שותי שכר’, אמר דוד לפני הקב”ה: אף על פי ששתינו – ואני תפלתי לך ד’. כלומר דשארי בני אדם שותים ומשתכרים ועוסקים בהוללות, ואני כשאשתה – עוסק אני אחר כך בתורה ותפ(י)לה. ולכן ביום טוב דליכא קריאת התורה – אין אומרים אותו, ועוד דעת רצון הוא רק בשבת במנחה, ונקרא בזוהר (שמות דף פ”ח:) ‘רעוא דרעוין’, וזהו רק בשבת כידוע. ולכן בשבת אף כשמתפלל ביחידות או במקום שאין ספר תורה – מכל מקום אומרים אותו (עיין מג”א סק”א), וכן הוא בזוהר תרומה.
**Translation**
We say ‘Ashrei’ and ‘Uva LeTziyon’, for on every day we say them in Shacharit, and on Shabbat, since in any case there is length in Shacharit—they left it for Mincha. And we say a half Kaddish. And afterwards we say the verse ‘Va’ani Tefilati’, based on the Midrash that previously it is written: ‘Those who sit in the gate talk about me, and I am the song of drunkards,’ David said before the Holy One, Blessed be He: Even though we have drunk—’as for me, may my prayer be unto You, Hashem.’ That is to say, that other people drink and become intoxicated and engage in revelry, but I, when I drink—I engage afterwards in Torah and prayer. Therefore, on Yom Tov, when there is no reading of the Torah—we do not say it; and furthermore, the time of favor is only on Shabbat at Mincha, and it is called in the Zohar (Shemot page 88b) ‘the favor of favors’, and this is only on Shabbat as is known. Therefore, on Shabbat, even when one prays in private or in a place where there is no Sefer Torah—nevertheless we say it (see Magen Avraham subsection 1), and so it is in Zohar Terumah.
—
## OC 615:1 — Marital Relations on Yom Kippur
**Hebrew**
יום הכיפורים אסור בתשמיש המטה, וגם זה בכלל “ועניתם”, דמניעת תשמיש מקרי “עינוי”. ובגמרא (עז ב) יליף זה מקרא, עיין שם. ואפילו במקום שאין טובלין לקרי – אסור התשמיש (טור), כלומר: דלא תימא דהאיסור הוא משום הטבילה שאחר כך, מפני איסור רחיצה, אלא התשמיש עצמו אסור. ולכן יתרחק אדם מאשתו כל יום זה כמנידה. והיינו: שלא יישן עמה במיטה אחת, ולא יגע בה בלילה. אבל ביום אין חשש כמובן (ט”ז). ויש שאוסר גם ביום (מגן אברהם), וכמדומה שאין המנהג כן. ולא ירבה לדבר עמה (שם).
**Translation**
Yom Kippur is forbidden for tashmish hamitah, and this too is included in “and you shall afflict yourselves,” for the withholding of tashmish is called “affliction.” And in the Gemara (77b) this is derived from a verse, see there. And even in a place where they do not immerse for keri—tashmish is forbidden (Tur), that is to say: so that you should not say that the prohibition is because of the immersion that follows, due to the prohibition of washing, but rather the tashmish itself is forbidden. Therefore, a man should distance himself from his wife all this day as if she were a niddah. And that is: that he should not sleep with her in one bed, and he should not touch her at night. But during the day there is no concern, of course (Taz). And there is one who forbids even during the day (Magen Avraham), and it seems that the custom is not so. And he should not speak extensively with her (ibid.).
—
## YD 38:13 — The Appearance of a Lung
**Hebrew**
ולפע”ד היה נראה להשוות הדיעות במקצת דבאמת אמראה בשר קשה מאד הרי זו אדום ולכן כתב אחד מהגדולים שצריך להבחין באדמימות אם אינה נוטה למראה בשר [שמ”ח סעי’ ד’] אבל מי הוא האיש החכם ויבן את זאת ולכן נראה דזה שכתבו כמראה בשר לאו משום המראה של אדמימות אלא גם על המראה העצמיות שלה דידוע דריאה היא חלקה מאד ולהיפך מזה הבשר אינו חלק כלל שהרי נמשכת לחוטים דקים ולכן אם הרואה לעינים לא יאמר עליה שזו היא ריאה אלא בשר היא טרפה ואם יאמר הרואה שזהו כבד כשרה וה”ה אם יאמר עליה שזהו טחול ג”כ כשר ולפ”ז זה שאמרו כמראה בשר טרפה לאו משום המראה אדמומית אלא משום מראה העצם שגשמיותה נראית כבשר [ומתורץ בזה קושית המאור למה לא אמר הש”ס כעין בישרא ע”ש]:
**Translation**
In my humble opinion, it would seem possible to reconcile the opinions somewhat, for indeed, regarding the appearance of meat, it is very difficult, as this is red. Therefore, one of the great authorities wrote that one must discern in the redness if it does not lean toward the appearance of meat [Siman 38, section 4]. But who is the wise man who can understand this? Therefore, it seems that what they wrote, “like the appearance of meat,” is not because of the appearance of redness, but also regarding its essential appearance. For it is known that a lung is very smooth, and conversely, meat is not smooth at all, as it is drawn into thin fibers. Therefore, if the observer would not say of it that this is a lung, but rather that it is meat, it is terefah. But if the observer would say that this is liver, it is kosher; and the same law applies if he would say of it that this is a spleen, it is also kosher. According to this, that which they said, “like the appearance of meat is terefah,” is not because of the appearance of redness, but because of the appearance of the substance, that its physical essence appears like meat [and with this, the difficulty of the Maor is resolved as to why the Shas did not say “like meat,” see there].
—
## CM 378:13 — Both Parties With or Without Permission
**Hebrew**
ומה נקרא שניהם ברשות כגון חצר השותפים או שנתן הבעה”ב רשות לשניהם ליכנס לחצרו או שהניזק נתן רשות למזיק ליכנס לחצרו או המזיק לניזק וכן כששניהם הלכו ברה”ר או בסימטא מקרי שניהם ברשות דהלא לשניהם יש רשות להלוך אבל כשרצים ברה”ר ובסימטא מקרי שניהם שלא ברשות דאין רשות לאדם לרוץ במקום הילוך רבים ושלא במקום הילוך רבים כמו בשדה ובקעה מקרי ברשות גם כשרצים [נ”ל] וכן בחצר שאינו של שניהם ובעל החצר לא נתן להם רשות מקרי שלא ברשות אבל כשלאחד נתן רשות ולא להשני הוי לזה שנתן רשות כחצרו:
**Translation**
And what is called “both are with permission”? For example, a partners’ courtyard, or where the homeowner gave permission to both of them to enter his courtyard, or where the injured party gave permission to the damager to enter his courtyard, or the damager to the injured party. And similarly, when both were walking in a public domain or in a side-alley, it is called “both are with permission,” for behold, both have permission to walk. But when they are running in a public domain or in a side-alley, it is called “both are without permission,” for a person does not have permission to run in a place where many walk. And in a place that is not a place where many walk, such as in a field or a valley, it is called “with permission” even when they run [it appears to me]. And similarly, in a courtyard that does not belong to both of them and the owner of the courtyard did not give them permission, it is called “without permission.” But when he gave permission to one and not to the other, it is for the one to whom he gave permission like his own courtyard.
[1] Special thanks to Emory University’s Law and Religion Center, which funded the model calls for this project. I want to thank my father, Rabbi Michael Broyde, for his thoughts on this piece and general helpful discussions when engaging in this project. His work on the Aruch Hashulchan is what inspired this project in the first place. I thank him also for all the Torah he has taught me and continues to teach me. Thank you also to Sefaria for publishing the original Hebrew text.
[2] My father, Rabbi Michael Broyde, undertook a scholarly translation of the Hilchos Shabbos chapters of the *Aruch HaShulchan* (*Orach Chaim* 242–292, see here). Aryeh Press is separately producing AI-assisted translations of the Aruch HaShulchan that integrate commentary from the Mishneh Brurah and the Shulchan Aruch HaRav, a more ambitious and editorially intensive project targeting the same text.
[3] For Yoreh Deah, I had to combine two different files from Sefaria: Arukh HaShulchan – he – Arukh HaShulchan, Yoreh De’ah — Wikisource.json and Arukh HaShulchan – he – Aruch HaShulchan, Vilna 1923-29.json
[4] Previously, in September 2024, I used GPT-4o to translate the Aruch HaShulchan, covering Orach Chaim, Yoreh De’ah, and Even HaEzer. You can find that older translation here. I found that it also produced translations of good quality. I have since updated the translation using Gemini 3 Flash, and extended coverage to all four sections of the Shulchan Aruch.
[5] All translation pipeline code, including the API client, the translation script, the thinking-level experiment, the post-processing normalization pass, and the code powering the website, was written with Claude Code (Anthropic’s AI coding assistant). Claude Code also assisted with some qualitative analysis of the translations and with drafting this article.
[6] You can find the code for creating the translation, including prompts for the model, here. All model API calls had a temperature of 0.4.
[7] See appendix for an example. You can see the full results of the experiment here.
[8] In the translation, I found that the MINIMAL setting never did any thinking. However, this is not guaranteed, and the MINIMAL setting can sometimes engage in thinking.
[9] Even this higher rate remains significantly lower than professional human translation, which typically ranges from $0.05 to $0.20 per word. Even at the lower estimate of $0.05 per word, a professional translation of the Aruch HaShulchan would cost approximately $170,000, making the AI-driven alternative a fraction of the market rate.
[10] That being said, my impression from informally testing Anthropic’s Claude and OpenAI’s ChatGPT is that they generate similarly reasonable translations. As I stated above, even the older GPT-4o model did a solid job of translating sections of the Aruch Hashulchan.
[11] I chose a totally different model to avoid any potential bias from Gemini both creating and evaluating the translations. However, in principle, LLMs of the same family can be used to evaluate their own responses.
[12] The actual Gemini-generated translation correctly translates this phrase as “And certainly this is only a mere stringency”. I want to thank Claude Code for accidentally surfacing this nice example. Claude Code originally flagged this correct translation as being incorrect and I had to fight with it to persuade it that in fact the original translation was correct(!)
[13] My impression is that about a third to a half of all mistakes are incorrectly expanded abbreviations.
[14] For true publication quality translations, I think more work needs to be done to test whether smaller models can be used, especially if those translations will incorporate auto-generated commentaries and cross-references.


19 thoughts on “Translating the Aruch HaShulchan with AI: Method and Lessons”
I have an early version of a Claude Code translation pipeline I developed in a public Github repo. I have translated many books.
https://github.com/sweisman/translation-pipeline
Give it a PDF of scanned images, like from HebrewBooks or Internet Archive, and it converts to an electronically readable document. No need for Sefaria as a text source.
I also experienced the same problems you had with terminology and name shifting. I also started with a post-processing step, but later shifted to a carry-forward automatically-generated glossary, and inject the end of one page batch to the beginning of the next, to maintain consistent context. But you need Claude Code (or ChatGPT Codex) to do this.
Yes, the glossary approach is a good one, and it could probably be done efficiently by programmatically selecting which glossary terms need to be injected, rather than sending an extremely large glossary each time. An even more dynamic approach would be to have the LLM generate not just the translation but also new terms that are not yet in the glossary. These can then be appended to the glossary.
If this is done via API, it can be done with any model, not just Claude Code or ChatGPT Codex.
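As a rough illustration of the selection idea above, here is a minimal Python sketch. It is not code from either pipeline: the function names, the prompt wording, and the sample glossary entries are all hypothetical, and the matching is a plain substring check that ignores prefixes, abbreviations, and spelling variants.

```python
def select_glossary_terms(source_text: str, glossary: dict[str, str]) -> dict[str, str]:
    """Return only the glossary entries whose Hebrew term occurs in the passage."""
    return {term: rendering for term, rendering in glossary.items() if term in source_text}


def build_prompt(source_text: str, glossary: dict[str, str]) -> str:
    """Build a translation prompt that carries just the relevant glossary entries."""
    relevant = select_glossary_terms(source_text, glossary)
    lines = [f"- {term} => {rendering}" for term, rendering in relevant.items()]
    glossary_block = "\n".join(lines) if lines else "(no glossary terms apply)"
    # The prompt wording below is an assumption, not the article's actual prompt.
    return (
        "Translate the following passage into English.\n"
        "Use these fixed renderings:\n"
        f"{glossary_block}\n\n"
        f"Passage:\n{source_text}"
    )


def merge_new_terms(glossary: dict[str, str], proposed: dict[str, str]) -> None:
    """Append terms proposed by the model without overwriting existing renderings."""
    for term, rendering in proposed.items():
        glossary.setdefault(term, rendering)
```

In a real run, the `proposed` dictionary would come from asking the model to return new terms alongside the translation, and a human would probably want to review those additions before they are reused.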
The glossary works that way. There are several glossaries, actually. I don’t have to prefill anything. The LLM does it all automatically. And it uses an LRU algorithm to avoid getting too big. None of this is in my public repo.
Good point about API usage. But Claude Code as a harness is quite excellent. I haven’t used Codex myself, but ChatGPT says it can run such a pipeline too. Although it concedes that Claude’s Vision capability is superior to its own image processing for my use case, for now.
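Since the glossary code described here is not public, the following is only a guess at what an LRU-bounded glossary could look like, sketched in Python with the standard library's `OrderedDict`. The class name `LRUGlossary` and its methods are hypothetical.

```python
from collections import OrderedDict


class LRUGlossary:
    """A glossary capped at max_terms; the least recently used term is evicted first."""

    def __init__(self, max_terms: int) -> None:
        self.max_terms = max_terms
        self._entries = OrderedDict()

    def add(self, term: str, rendering: str) -> None:
        """Insert or update a term and mark it as most recently used."""
        self._entries[term] = rendering
        self._entries.move_to_end(term)
        while len(self._entries) > self.max_terms:
            self._entries.popitem(last=False)  # drop the least recently used entry

    def lookup(self, term: str):
        """Return the rendering (or None) and refresh the term's recency on a hit."""
        if term not in self._entries:
            return None
        self._entries.move_to_end(term)
        return self._entries[term]

    def terms(self) -> list:
        """List terms from least to most recently used."""
        return list(self._entries)
```

One consequence of this design is that a term which has not appeared for many pages can be evicted and later re-translated differently, so the cap trades some consistency for a smaller prompt.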
One area where this pipeline has turned out to be unexpectedly useful is medieval scientific manuscripts.
Over the last few weeks I’ve been using it on Ralbag’s *Sefer HaTechunah* (“Book of Astronomy”), the astronomical section of *Milhamot HaShem*. The complete work has remained largely inaccessible outside manuscript form; only a small portion was ever published academically. I’m currently working directly from a 1397 Sephardic manuscript from the BnF.
Current results are probably in the ~60–70% usable first-pass range before human correction, depending on manuscript quality and subject matter — but importantly, that’s already enough to recover actionable structure, argument flow, terminology, and mathematical content from texts that were previously effectively unreadable without specialist effort. I suspect further pipeline improvements can push this substantially higher.
One especially interesting section contains a solar-distance derivation unlike the standard classical methods. Instead of relying directly on solar parallax, the method derives the changing Earth-Sun distance from seasonal variations in the Sun’s apparent motion and apparent diameter. It’s a genuine mathematical chidush, and appears largely absent from modern histories of astronomy. What makes it especially interesting is that, unlike most later solar-distance methods, the observational requirements are potentially within reach of independent observers using modern and affordable optical equipment.
Feel free to plug this entire comment into any modern AI model and ask it to evaluate the historical and mathematical claims independently. The responses have been surprisingly interesting.
Since you clearly have both the technical and Hebrew background for this kind of work, I figured there might be overlap in interests. If you’d ever like to discuss manuscript translation workflows or possibly collaborate privately on some of these texts, feel free to reach out.
https://github.com/sweisman/translation-pipeline
How can I contact you Shimshon? I want to talk about your astronomical calculations.
Wow I am embarrassed. Haha the joke is on me. I was wrong. But it is still genuinely cool stuff and if you want to collaborate on anything I’d be interested.
Wow I am embarrassed. Haha the joke is on me. I was wrong. A little bit.
I didn’t discover something truly new. Still, this solar distance calculation technique was only documented recently, dating back to the 1980s. It is still unknown to this day, because it was only published in obscure astronomy history publications, and never penetrated a wider audience. It’s a valid technique, within reach of somewhat modest means. This makes my rediscovery of it useful. It is an interesting and genuine chidush that deserves more attention.
Not sure how this measuring scheme works. Measuring variation of apparent solar diameter will only tell you something about relative distance, parallax is still required for absolute distance.
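To make the relative-versus-absolute point concrete, here is a small Python sketch of the underlying geometry. It treats the Sun as a sphere of radius R seen from distance d, so sin(θ/2) = R/d, where θ is the apparent angular diameter. The function names are made up for illustration, and the arcminute values in the usage note are round illustrative numbers, not measurements from the manuscript.

```python
import math


def distance_in_solar_radii(angular_diameter_arcmin: float) -> float:
    """Distance to the Sun in units of the Sun's own radius, from sin(theta/2) = R/d."""
    half_angle = math.radians(angular_diameter_arcmin / 60.0) / 2.0
    return 1.0 / math.sin(half_angle)


def distance_ratio(theta_near_arcmin: float, theta_far_arcmin: float) -> float:
    """Ratio d_far / d_near implied by two apparent diameters; R cancels out."""
    return distance_in_solar_radii(theta_far_arcmin) / distance_in_solar_radii(theta_near_arcmin)
```

For example, `distance_ratio(32.5, 31.5)` is very close to 32.5 / 31.5, meaning the farther distance is about 3% larger than the nearer one. Every result is expressed in units of the unknown solar radius R, so the diameter measurements alone fix only the ratio of distances; some independent measurement is still needed to set the absolute scale.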
How many hours did all this take you, all the time from beginning to end? And how much programming knowledge is necessary to make it all work?
The whole project took perhaps 30 hours, including the experimentation and benchmarking. Claude Code itself wrote the code to do this, so a technical background (e.g. the ability to write code by hand) is no longer needed the way it was in the past. However, a basic understanding of how code works is still useful.
The main thinking that has to be done is around the prompts, and also the key design choices (which model to use, what does the model see when translating, etc.). These are very important for a project like this.
Interesting article.
I have only used AI to translate short texts recently and also found the results satisfactory- at least for personal use. I agree with you that much of the labor is now in proof-reading.
I was surprised that the AI blundered יין שרוף. Could it be it’s because יין שרף is more frequently used? I searched Sefaria and got the following results:
33 -יין שרוף
203-יין שרף
I am actually not sure that most of the labor will be in proof-reading for much longer. I think that better models, and models that translate with more data, will produce better and better translations, and proof-readers will probably only look at sections where the AI itself flags the translation as needing review.
What I actually find even more weird are the mistakes that appear to be typos, such as “pruz u-revu” where it hallucinates a zayin. I am not sure where those come from either!
In general, I’m finding that this translation is very literal. For this particular example, you could be right about the rarity of the term, since the model is less likely to know a rarely used term. I assume a more advanced model would get the translation correct, albeit at higher cost.
Good article. Much too technical for me, but you do a good job of bringing home how relatively easy it is (for technical-minded people!) to create translations of texts, for virtually pennies and in virtual minutes. You note the caveats, but in my opinion translations should always be used alongside the original, because they are an aid, not a replacement. Thus, the errors caused by AI – and there are always errors, even when done by men – are not a *psul* in the overall project.
Thanks! The honest truth is that these projects are becoming less and less technical. Since AI writes the code, we are entering an era where people can do complex projects while essentially acting as a manager. As AI gets better and better, I anticipate this will accelerate even more.
Interesting post. One thing I’m confused about, based on this
> inexpensive ($40.73), and fast (taking a few days on my personal Mac)
So was this using a cloud provider or local?
P.S. I remember you from KBY. Nice to see what you’re working on these days 🙂
Nice to cross paths again Yehoshua! I remember you from KBY as well!
This was all using Gemini, so it ran on Google’s cloud. The reason it took so long is that my script still processes each seif one at a time. That, plus the quota limits, where I had to wait for the quota to refresh, made it take a few days.
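A minimal Python sketch of that kind of loop is below. It is not the project's actual script: `translate` stands in for whatever function calls the model, and `QuotaExceeded` is a hypothetical exception that such a function might raise when the quota runs out.

```python
import time


class QuotaExceeded(Exception):
    """Hypothetical error raised by the translate function when quota is exhausted."""


def translate_sequentially(seifim, translate, wait_seconds=60.0, max_retries=5, sleep=time.sleep):
    """Translate seifim one at a time, pausing and retrying when quota is exhausted."""
    results = []
    for seif in seifim:
        for attempt in range(max_retries + 1):
            try:
                results.append(translate(seif))
                break  # this seif is done; move on to the next one
            except QuotaExceeded:
                if attempt == max_retries:
                    raise  # give up after the final allowed retry
                sleep(wait_seconds)  # wait for the quota to refresh, then retry
    return results
```

Because each call blocks until the previous one finishes, the total run time is roughly the number of seifim multiplied by the per-call latency, plus all the quota waits. With 24,161 seifim, that adds up quickly even when each individual call is fast.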
This was all very interesting. Just as an aside, I copied the Sefaria text of the first seif of the Urim Publications translation (OC 274:1) into ChatGPT (free account), together with your AI translation and the Urim Publications translation. I asked ChatGPT to compare the two English translations first against the Sefaria text, and then against each other as English translations. I provided no other prompt criteria or context. Here are the results:
This is a useful comparison because the differences are subtle but reveal different translation philosophies.
## 1. Comparison against the Hebrew
The Hebrew is from the Aruch HaShulchan (O.C. 274:1), and is written in a fairly straightforward nineteenth-century rabbinic style.
### Opening citation
**Hebrew**
> גרסינן בשבת (קי”ז:)…
Literally:
> “We have the following text in Shabbat 117b”
**AI translation**
> We learn in Shabbat (117b)
**Urim**
> The Gemara Shabbat (117b) reads
**Assessment:** Urim is more precise. “גרסינן” literally means “we read/have the text,” not “we learn.”
—
### חייב אדם לבצוע על שתי ככרות
Literally:
> A person is obligated to break bread over two loaves.
**AI**
> obligated to break bread over two loaves
**Urim**
> obligated to break bread with two loaves
**Assessment:** AI is closer. “לבצוע על” means making the hamotzi/betziah over two loaves, not merely “with” two loaves.
—
### והך לחם משנה מיותר
Literally:
> and this phrase “lechem mishneh” is superfluous.
**AI**
> this “double bread” is superfluous
**Urim**
> the words “double bread” are redundant
Both are fine. “Superfluous” is slightly closer to מיותר.
—
### ואין זה אלא כפל דברים
Literally:
> this is nothing but duplication/repetition of words.
**AI**
> this is nothing other than a duplication of words
**Urim**
> this is mere repetition
AI is closer to the literal wording.
—
### ולזה דרשינן
Literally:
> therefore we expound/derive
**AI**
> For this reason we expound
**Urim**
> We, therefore, expound
Essentially identical.
—
### והכי פירושו
Literally:
> and thus is its meaning/explanation
**AI**
> and this is its explanation
**Urim**
> and what it means is
Both accurate.
—
### דהלחם של סעודות שבת תהיה משנה
Literally:
> that the bread of the Shabbat meals should be doubled.
The key word is משנה.
**AI**
> the bread of the Shabbat meals shall be double
**Urim**
> the bread of the Shabbat meal is to be double
AI preserves the plural “meals” (סעודות), whereas Urim changes it to singular.
Advantage: AI.
—
### מכילתא
Both correctly render:
> In the Mechilta…
No issue.
—
### לחם כפול
This is the most significant translation difference.
**Hebrew**
> לחם כפול
Literally:
> doubled bread
> folded/double bread
> bread that is doubled
**AI**
> doubled bread
**Urim**
> a pair of breads
This is an interpretive translation.
The Hebrew does **not** literally say “a pair of breads.” It says “bread doubled.”
However, Urim is trying to express the halachic implication—that two loaves are required.
From a translation standpoint:
* AI is more literal.
* Urim is more explanatory.
Advantage: AI for fidelity.
—
### ונראה שזהו דין תורה ולא אסמכתא בעלמא
Literally:
> It appears that this is a Torah law and not merely an asmachta.
**AI**
> a Torah law and not a mere asmakhta
**Urim**
> a biblical law and not merely a rabbinic law supported by the verse
This is another major difference.
The Hebrew says:
> אסמכתא בעלמא
which is a technical rabbinic term.
The AI preserves the technical term.
Urim explains the concept:
> a rabbinic law supported by the verse
This is pedagogically useful, but it is no longer a translation. It is interpretation.
For readers familiar with rabbinic terminology, the AI version is more faithful.
—
## 2. English against English
The two translations represent two distinct approaches.
### AI Translation
Characteristics:
* More literal.
* Preserves rabbinic terminology.
* Follows Hebrew sentence structure closely.
* Retains ambiguity where the Hebrew is ambiguous.
Examples:
> doubled bread
> Torah law
> asmakhta
Strengths:
* Better for students studying the Hebrew.
* Easier to reconstruct the original wording.
* More philologically accurate.
Weaknesses:
* Slightly awkward English.
* Assumes familiarity with rabbinic vocabulary.
—
### Urim Translation
Characteristics:
* More polished.
* More explanatory.
* Frequently translates concepts rather than words.
Examples:
> a pair of breads
instead of
> doubled bread
and
> a rabbinic law supported by the verse
instead of
> asmakhta
Strengths:
* More accessible to general readers.
* Reads like contemporary English.
* Explains technical concepts.
Weaknesses:
* Occasionally imports interpretation into the translation.
* Makes the text sound more definite than the Hebrew itself.
—
## 3. Places where Urim arguably over-translates
### “a pair of breads”
Hebrew:
> לחם כפול
Urim:
> a pair of breads
A better translation would be:
> doubled bread
with a footnote explaining that this is understood as requiring two loaves.
—
### “a rabbinic law supported by the verse”
Hebrew:
> אסמכתא בעלמא
This is a reasonable explanation of the concept, but it is commentary rather than translation.
A more faithful translation would be:
> not merely an asmachta
with a note explaining the term.
—
## Overall judgment
If the goal is **faithfulness to the Aruch HaShulchan’s wording**, the AI translation is generally closer to the Hebrew. It preserves technical terminology, follows the syntax more closely, and translates phrases like לחם כפול and אסמכתא more literally.
If the goal is **readability for an English-speaking lay audience**, the Urim translation is superior. It reads smoothly and explains rabbinic concepts that many readers would not know.
As a scholarly translation, I would probably combine the strengths of both:
> We learn in Shabbat (117b): “R. Abba said: On Shabbat a person is obligated to break bread over two loaves, as it is written, ‘double bread’ (lechem mishneh).” That is, concerning the manna it is written: “And it was on the sixth day that they gathered double bread, two omers for each person.” The phrase “double bread” is superfluous, for it could simply have stated, “they gathered two omers for each person”; otherwise it is merely repetitive. Therefore we expound it as an independent teaching: the bread used for the Shabbat meals must be doubled. The Mechilta likewise states: “‘Double bread’—R. Yehoshua says: ‘doubled bread.'” It appears that this is a Torah obligation and not merely an asmachta, for the phrase “double bread” is entirely superfluous, as explained above.
That version stays very close to the Hebrew while still reading naturally in English.
Unless you guide the AI with proper prompting, it will translate as it sees fit, and it could be wrong or off.
They need to be guided.
About four weeks ago I sat down to translate one old book, a Hebrew text I found scanned on HebrewBooks.org, into English. One turned into two, this time in Latin. Then I built a pipeline to run a couple at a time. Now I’ve got a factory. And a name.
Announcing Ars Astronomica, a boutique publisher of centuries-old books about the heavens. Some of the greatest scientific works in history, most in readable English for the first time. One was never even printed; it survives only as a handwritten manuscript.
Please check it out and share.
http://ArsAstronomica.com/
All works are free and no registration is required to download.