Output Devanagari (Hindi) from raw unicode using luatex

I can get the following code to compile with LuaLaTeX, and the Hindi/Devanagari characters are printed correctly in the PDF:



\documentclass{article}
\usepackage{fontspec}
\setmainfont{Times New Roman}
\newfontscript{Devanagari}{deva,dev2}
\newfontface\hindi[Script=Devanagari]{Lohit-Devanagari.ttf}

\begin{document}
Here is normal text.
\hindi नमस्ते
\end{document}


However, I'm using a program that generates the TeX source, and it won't let me type the Hindi script directly; instead, it only gives me the Unicode code points of the word "नमस्ते", which are "<U+0928><U+092E><U+0938><U+094D><U+0924><U+0947>".



How can I get LuaLaTeX to compile correctly from these raw code points? What I want to compile (to produce a PDF containing the word "नमस्ते") is something like this:



\documentclass{article}
\usepackage{fontspec}
\setmainfont{Times New Roman}
\newfontscript{Devanagari}{deva,dev2}
\newfontface\hindi[Script=Devanagari]{Lohit-Devanagari.ttf}

\begin{document}
Here is normal text.
\hindi <U+0928><U+092E><U+0938><U+094D><U+0924><U+0947>
\end{document}


...but that won't work.










fonts luatex languages characters indic














asked 3 hours ago by lethalSinger (edited 1 hour ago by ShreevatsaR)












  • Can you get your program to output \char"0928\char"092E\char"0938\char"094D\char"0924\char"0947 instead of <U+0928><U+092E><U+0938><U+094D><U+0924><U+0947>?

    – Mico
    3 hours ago

  • Yes, I could do that! What would the full script then need to look like?

    – lethalSinger
    3 hours ago

  • I'm afraid I cannot answer your question as I don't know which scripting tool you employ. I just posted an answer, though, which creates a Lua function that converts <U+0928><U+092E><U+0938><U+094D><U+0924><U+0947> to \char"0928\char"092E\char"0938\char"094D\char"0924\char"0947.

    – Mico
    2 hours ago
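
For reference, if the generating program can be made to emit \char primitives as suggested in the comments above, the complete document might look roughly like the following sketch. It reuses the preamble from the question unchanged and assumes that the Lohit-Devanagari.ttf font file can be found by LuaLaTeX; it has not been checked against the asker's actual program.

```latex
\documentclass{article}
\usepackage{fontspec}
\setmainfont{Times New Roman}
\newfontscript{Devanagari}{deva,dev2}
\newfontface\hindi[Script=Devanagari]{Lohit-Devanagari.ttf}

\begin{document}
Here is normal text.
% Each \char"XXXX selects the character with hexadecimal code point XXXX.
\hindi \char"0928\char"092E\char"0938\char"094D\char"0924\char"0947
\end{document}
```

One caveat with this form: TeX treats a space that directly follows the hexadecimal number as the end of the number and discards it, so a plain space between two words written this way would be lost. An explicit "\ " between the words avoids that.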

















1 Answer
(added an extra operation in the Lua function 'conv' to address the OP's follow-up request)



Since you're using LuaLaTeX, here's a solution that employs a Lua function to convert strings of the form '<U%+(.-)>' to '\\char"%1'. In the search pattern, %+ matches the literal character + and (.-) is a non-greedy "capture" -- in words: "as few characters as possible before the next >". In the replacement string, %1 stands for whatever (.-) captured. In a second step, the Lua function converts every run of whitespace characters in the string to an explicit (interword) space.



In addition, the code sets up a LaTeX macro, \conv, that acts as a front end for the Lua function. Thus, one may call the Lua function via a \conv{<your string here>} directive.



You can either wrap the sequences of Unicode code points in \conv{...} statements by hand or, depending on how far you can get your program to do the work for you, instruct the program to wrap them in \conv{...} statements automatically.






\documentclass{article}
\usepackage{fontspec}
\setmainfont{Times New Roman}
\newfontscript{Devanagari}{deva,dev2}
\newfontface\hindi[Script=Devanagari]{Lohit-Devanagari.ttf}

%%%% -- copy the next eight lines of code to your document --
\usepackage{luacode} % for 'luacode' env. and '\luastringN' macro
\begin{luacode}
function conv ( s )
  s = s:gsub ( '<U%+(.-)>' , '\\char"%1' )
  tex.sprint ( ( s:gsub( '%s+' , '\\ ' ) ) )
end
\end{luacode}
\newcommand\conv[1]{\directlua{conv(\luastringN{#1})}}

\begin{document}
Latin-alphabet text.

\hindi नमस्ते

\hindi \conv{<U+0928><U+092E><U+0938><U+094D><U+0924><U+0947>}

\hindi \conv{<U+0928><U+092E><U+0938><U+094D><U+0924><U+0947> <U+0930><U+093E><U+091C>}
\end{document}
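
To see what the two gsub steps do in isolation, the following sketch may help. It applies them to a short test string and writes the intermediate results to the log file instead of typesetting them. The test string and the "step 1" / "step 2" labels are made up for illustration; texio.write_nl is LuaTeX's function for writing a line to the log and terminal.

```latex
\documentclass{article}
\usepackage{luacode}
\begin{luacode}
-- Hypothetical test string: two code points, a space, one more code point.
local s = '<U+0928><U+092E> <U+0930>'

-- Step 1: rewrite each code point in angle brackets as a char primitive call.
s = s:gsub ( '<U%+(.-)>' , '\\char"%1' )
texio.write_nl ( 'step 1: ' .. s )

-- Step 2: replace each run of whitespace by an explicit interword space.
s = s:gsub ( '%s+' , '\\ ' )
texio.write_nl ( 'step 2: ' .. s )
\end{luacode}

\begin{document}
Nothing is typeset from the test string; see the log file.
\end{document}
```

After step 1 the log should show \char"0928\char"092E \char"0930, and after step 2 the plain space should have become "\ ". Without step 2, TeX would read that plain space as the terminator of the preceding hexadecimal number and drop it.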





answered 2 hours ago by Mico (edited 1 hour ago)







  • This gets incredibly close. The only problem now is with breaks between words, which get ignored. E.g. "नमस्ते राज" (2 words) gets printed as "नमस्तेराज" (1 single word) even though there is a proper space between the Unicode code points: "<U+0928><U+092E><U+0938><U+094D><U+0924><U+0947> <U+0930><U+093E><U+091C>". How can I fix the spacing issue?

    – lethalSinger
    2 hours ago

  • @lethalSinger - Please see the updated answer I just posted. (The solution is to add a second gsub (short for "global substitution") operation.)

    – Mico
    1 hour ago

  • @lethalSinger -- Instead of inserting a second gsub step, the whitespace issue could also have been "solved" by changing s:gsub( '<U%+(.-)>' , '\\char"%1' ) to s:gsub( '<U%+(.-)>' , '\\char"%1{}' ); note the insertion of a pair of curly braces. IMNSHO, though, it's preferable -- and certainly more transparent, coding-wise -- to avoid sleights of hand such as inserting an "empty TeX group" and to perform two separate gsub operations.

    – Mico
    18 mins ago
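
For comparison only, the single-gsub alternative mentioned in the last comment could be sketched as below. The names convbraces and \convbraces are hypothetical and chosen here so as not to clash with \conv; this is the variant the answer's author advises against, and it is untested.

```latex
\documentclass{article}
\usepackage{fontspec}
\setmainfont{Times New Roman}
\newfontscript{Devanagari}{deva,dev2}
\newfontface\hindi[Script=Devanagari]{Lohit-Devanagari.ttf}

\usepackage{luacode}
\begin{luacode}
-- Hypothetical variant of conv: one gsub whose replacement ends in an
-- empty TeX group, so that a following space is not read as part of
-- the hexadecimal number.
function convbraces ( s )
  tex.sprint ( ( s:gsub ( '<U%+(.-)>' , '\\char"%1{}' ) ) )
end
\end{luacode}
\newcommand\convbraces[1]{\directlua{convbraces(\luastringN{#1})}}

\begin{document}
\hindi \convbraces{<U+0928><U+092E><U+0938><U+094D><U+0924><U+0947> <U+0930><U+093E><U+091C>}
\end{document}
```

Here the empty group ends the number after every character, so the space between the two words reaches TeX as an ordinary space token. The two-step version in the answer makes the same intent explicit without relying on that side effect.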












