Solved Processing unicode over U+10000

Discussion in 'Spigot Plugin Development' started by joeleoli, Apr 25, 2017.

  1. I'm storing a display format in my database that should contain Unicode characters, but the database doesn't support them. Because of this, I'm trying to write a small parsing util that converts placeholders for me using StringEscapeUtils, but StringEscapeUtils doesn't seem to support Unicode characters over U+10000.

    I'm storing this in the database:
    Code (Text):
    §7<10031>PREMIUM
    Then I replace
    Code (Text):
    <10031>
    with
    Code (Text):
    StringEscapeUtils.unescapeJava("\u10003")
    yet it doesn't work. It converts it to "\u1000" and then adds a 3 at the end.

    Any ideas?
     
  2. What kind of database are you using? It looks like you simply need to change the character set and collation of your SQL database to one that supports UTF-8 (in MySQL that would be utf8mb4), which saves you the encode/decode trouble.
     
  3. @joeleoli, I'm pretty sure you're mixing up decimal and hexadecimal representations.
    https://en.wikipedia.org/wiki/Unicode Java's \u escape takes exactly four hex digits, i.e. one 16-bit char, not a whole code point.
    So \u... covers \u0000 through \uFFFF, and \u10031 is read as \u1003 followed by a literal 1.
    Decimal 10031 is hex 272F, so I guess you're looking for \u272F (✯).
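
    If you want to keep the <number> placeholders, here is a minimal sketch that skips StringEscapeUtils entirely. It assumes the number inside the angle brackets is a decimal code point; PlaceholderDecoder and decode are made-up names for illustration, not part of Spigot or Commons Lang. StringBuilder.appendCodePoint writes a surrogate pair when needed, so values above U+FFFF should work too.

    ```java
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    // Hypothetical helper: turns "<10031>" style placeholders into real characters.
    final class PlaceholderDecoder {
        // Assumption: a placeholder is 1 to 7 decimal digits in angle brackets.
        private static final Pattern PLACEHOLDER = Pattern.compile("<(\\d{1,7})>");

        static String decode(String stored) {
            Matcher m = PLACEHOLDER.matcher(stored);
            StringBuilder out = new StringBuilder();
            int last = 0;
            while (m.find()) {
                // Copy the text between the previous placeholder and this one.
                out.append(stored, last, m.start());
                int codePoint = Integer.parseInt(m.group(1));
                if (Character.isValidCodePoint(codePoint)) {
                    // Appends one char, or a surrogate pair above U+FFFF.
                    out.appendCodePoint(codePoint);
                } else {
                    // Not a Unicode code point: leave the placeholder as-is.
                    out.append(m.group());
                }
                last = m.end();
            }
            out.append(stored, last, stored.length());
            return out.toString();
        }
    }
    ```

    Calling PlaceholderDecoder.decode on the string from the first post should give you §7✯PREMIUM. If you ever need a character above U+FFFF as a literal in source code, it has to be written as two escapes (a surrogate pair), e.g. "\uD83D\uDE00" for U+1F600.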