Thanks Sergey
Your suggestion inspired me to do some more trials.
Declaring the text column
VARCHAR(3000) instead of NVARCHAR(1000)
the problem disappears, but I have to increase the column size because
somehow Virtuoso recognizes the text as non-Latin and uses UTF-8 that
occupies more space (till 4 times more than the default encoding).
I cannot declare my text columns VARCHAR, because I must be able to host
non-Latin text till 4000 characters long. In order to do this I should
declare the column
VARCHAR(16000), whereas the maximum size is 4082. Unfortunately it seem the
use of VARCHAR columns is not a solution for me...
Yet as the Virtuoso documentation
<http://docs.openlinksw.com/virtuoso/datatypes/>says that NVARCHAR is
dedicated to Unicode:
*"There are 3 additional data types to enable storing of Unicode data: *
-
*NCHAR*
-
*NVARCHAR*
-
*LONG NVARCHAR*
* All the Unicode types are equivalent to their corresponding "narrow" type
- CHAR, VARCHAR and LONG VARCHAR - except that instead of storing data as
one byte they allow Unicode characters. Their lengths are defined and
returned in characters instead of bytes."*
The data in my project are loaded with an ETL process implemented using the
Virtuoso bulk load feature (DB.DBA.csv_register() and DB.DBA.csv_loader_run()
procedures).
Can I still I use the ODBC / JDBC drivers to load my data into the RDB as
somebody suggested?
Cheers
Beppe
2017-03-21 11:20 GMT+00:00 Sergey Malinin <sergmali...@gmail.com>:
> On 03/21/2017 04:42 AM, Beppe Mazzola wrote:
>
>> Related to the problem I have about storing non-Latin character in
>> Virtuoso RDB, I see now that I haven't provided all the info related to it.
>>
>> So it was not evident that the Hungarian text written in the DB was not
>> the same read from it immediately after, despite the Virtuoso UTF-8 set-up
>> has been performed
>>
>> I bring here again the test case (now simplified and completed) to
>> reproduce the problem.
>>
>> It has been performed with Conductor Interactive SQL.
>>
>> It looks, that Conductor Interactive SQL does something wrong, when SQL
> insert(and etc) queries are executed, it inserts wrong data to table.
> But Conductor Interactive SQL shows unicode data properly, if it was
> inserted via ODBC/JDBC/ADO.NET .
>
> Also note:
> 1) you don't need to set "SQL_UTF8_EXECS = 1" in virtuoso.ini
> http://docs.openlinksw.com/virtuoso/wideidentifiers/
> for simple insert unicode data to columns.
>
> 2) for insert unicode values, the prefix N must be used
> insert into test values(1, N'АБВГ')
>
> 3) try insert unicode data via ODBC/JDBC/ADO.NET drivers, it must be
> inserted properly.
>
>
>
>> 1) log in as dba
>>
>> 2) Execute the following statements
>>
>> CREATE TABLE DB.test.HST_FTN
>> (
>> FOOTNOTES_ID NUMERIC NOT NULL,
>> FOOTNOTE_TEXT NVARCHAR(1000) NOT NULL,
>> CONSTRAINT PK_V_SEM_MRL_HST_FTN PRIMARY KEY ( FOOTNOTES_ID )
>> );
>>
>> INSERT INTO DB.test.HST_FTN
>> (FOOTNOTES_ID,FOOTNOTE_TEXT)
>> VALUES (4,'След тази дата МДГОВ ще бъде 0,05(*) mg/kg, освен ако не бъде
>> изменена с директива или регламент.');
>>
>> SELECT * FROM DB.test.HST_FTN ;
>>
>> 3) This is the result of the selectdisplayed, no longer the original text
>>
>> Query result:
>> FOOTNOTES_ID
>> DECIMAL FOOTNOTE_TEXT
>> NVARCHAR
>> 4 След тази дата МДГОВ ще бъде 0,05(*)
>> mg/kg, Ð¾Ñ Ð²ÐµÐ½ ако не бъде изменена Ñ
>> директива или регламент.
>> No. of rows in result: 1
>>
>> Here below again is the Virtuoso set up performed for UTF-8 .
>>
>> * all the table columns containing non-Latin text declared as
>> NVARCHAR(n)
>>
>> * in virtuoso.ini file
>> SQL_UTF8_EXECS = 1 ... [HTTPServer] Charset = UTF-8 ; this is the
>> default value
>> I do not know if there is still some set up missing or if it's a Virtuoso
>> issue. It's not clear to me as well whether the text is storied corrupted
>> in the DB by the INSERT statement or it is somehow corrupted later, when
>> Conductor reads it from the DB and returns it to the browser.
>> Any idea, hint are very welcome!
>> Thanks Cheers
>> Beppe
>> 2017-02-27 10:21 GMT+00:00 Beppe Mazzola <beppemazz...@gmail.com <mailto:
>> beppemazz...@gmail.com>>:
>>
>> Hi Hugh
>> This should reproduce the problem. In my Virtuoso installation the
>> Bulgarian text inserted in the test table is different to the one displayed
>> as a result
>> of the select.
>> 1) Create a user with name test with the following setting.
>> [picture removed]
>> 2) log in as user test
>> 3) Execute the following statements CREATE TABLE DB.test.HST_FTN (
>> FOOTNOTES_ID NUMERIC NOT NULL, FOOTNOTE_TEXT NVARCHAR(1000) NOT NULL,
>> CONSTRAINT
>> PK_V_SEM_MRL_HST_FTN PRIMARY KEY ( FOOTNOTES_ID ) );
>> INSERT INTO DB.test.HST_FTN
>> (FOOTNOTES_ID,FOOTNOTE_TEXT) VALUES (4,'След тази дата МДГОВ ще бъде
>> 0,05(*) mg/kg, освен ако не бъде изменена с директива или регламент.');
>> SELECT * FROM
>> DB.test.HST_FTN ;
>> Cheers
>> Beppe
>> 2017-02-25 22:42 GMT+00:00 Hugh Williams <hwilli...@openlinksw.com
>> <mailto:hwilli...@openlinksw.com>>:
>>
>> Hi Beppe,
>> That should be all that is required on the server for Unicode /
>> UTF-8 support.
>> Do you have a test case for recreating the problem being
>> observed, which should show how the data is being inserted and the queried ?
>> Best Regards
>> Hugh Williams
>> Professional Services
>> OpenLink Software, Inc. // http://www.openlinksw.com/
>> Weblog -- http://www.openlinksw.com/blogs/ <
>> http://www.openlinksw.com/blogs/>
>> LinkedIn -- http://www.linkedin.com/company/openlink-software/ <
>> http://www.linkedin.com/company/openlink-software/>
>> Twitter -- http://twitter.com/OpenLink
>> Google+ -- http://plus.google.com/100570109519069333827/ <
>> http://plus.google.com/100570109519069333827/>
>> Facebook -- http://www.facebook.com/OpenLinkSoftware <
>> http://www.facebook.com/OpenLinkSoftware>
>> Universal Data Access, Integration, and Management Technology
>> Providers
>>
>>> On 22 Feb 2017, at 14:20, Beppe Mazzola <beppemazz...@gmail.com
>>> <mailto:beppemazz...@gmail.com>> wrote:
>>> Hi to all
>>> In order to have Virtuoso relational DB able to correctly store
>>> the text with the UTF-8 encoding I have performed the following operations.
>>>
>>> * all the columns with text declared as NVARCHAR(n)
>>> * in virtuoso.ini file
>>> o SQL_UTF8_EXECS = 1
>>> o Charset = UTF-8 ; this is
>>> the default value
>>> This is not enough, since I see that the non-Latin characters
>>> are incorrectly stored in the DB yet.
>>> Do I miss any further set-up?
>>> I am using the following Virtuoso Open Source Version Virtuoso
>>> Open Source Edition (Column Store) (multi threaded) Version
>>> 7.2.4.2.3217-pthreads as
>>> of Feb 3 2017 Compiled for Linux (x86_64-unknown-linux-gnu)
>>> Thanks!
>>> Cheers
>>> Beppe
>>>
>>> ------------------------------------------------------------------------------
>>> Check out the vibrant tech community on one of the world's most
>>> engaging tech sites, SlashDot.org <http://slashdot.org>!
>>> http://sdm.link/slashdot_______________________________________________
>>> <http://sdm.link/slashdot___________________________________
>>> ____________> Virtuoso-users mailing list Virtuoso-users@lists.sourcefor
>>> ge.net
>>> <mailto:Virtuoso-users@lists.sourceforge.net>
>>> https://lists.sourceforge.net/lists/listinfo/virtuoso-users
>>> <https://lists.sourceforge.net/lists/listinfo/virtuoso-users>
>>>
>>
>> ------------------------------------------------------------
>> ------------------
>> Check out the vibrant tech community on one of the world's most
>> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>>
>> _______________________________________________
>> Virtuoso-users mailing list
>> Virtuoso-users@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/virtuoso-users
>>
>
> --
> Best Regards,
> Sergey Malinin
>
>
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Virtuoso-users mailing list
Virtuoso-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/virtuoso-users