Questions tagged [utf-8]

UTF-8 is a character encoding that describes each Unicode code point using a byte sequence of one to four bytes. It is backwards-compatible with ASCII while still supporting representation of all Unicode code points.
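
As a rough illustration of the description above, the following Python sketch prints the UTF-8 bytes for one character from each length class (one to four bytes) and checks that plain ASCII text encodes to identical bytes. The sample characters and the helper name `utf8_length` are chosen here for illustration only; they are not part of the tag description.

```python
# Sample characters picked for illustration: one per UTF-8 length class.
samples = ["A", "é", "ठ", "🏆"]


def utf8_length(ch: str) -> int:
    """Return how many bytes UTF-8 uses to encode a single character."""
    return len(ch.encode("utf-8"))


for ch in samples:
    encoded = ch.encode("utf-8")
    # Show the code point, its UTF-8 bytes in hex, and the byte count.
    print(f"U+{ord(ch):04X} -> {encoded.hex()} ({len(encoded)} byte(s))")

# Backwards compatibility: pure ASCII text yields the same bytes either way.
assert "hello".encode("utf-8") == "hello".encode("ascii")
```

Any string that contains only ASCII characters is therefore already valid UTF-8, which is why many encoding problems only surface once the first non-ASCII character appears.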

0
votes
0 answers
7 views

How can I set encoding on AWS Elastic Beanstalk?

I've got a Java application running on AWS Elastic Beanstalk that is reading from an InputStream from a website and uploading something to my FTP. Some of the data that is downloaded is in UTF-8, ...
0
votes
1 answer
23 views

I get an ANSI string instead of UTF-8 from a UTF-8 MySQL table

When I moved from PHP/MySQL shared hosting to my own VPS, I found that code which outputs user names in UTF-8 from a MySQL database outputs ?�??????� instead of 鬼神❗. My page has UTF-8 encoding, and I ...
0
votes
0 answers
12 views

PHP (PDO) and SQL (phpMyAdmin) characters from form are incorrect [duplicate]

I want to ask, what am I doing wrong? I want to take data from a form and store it in a database, but when I look at the database the characters are shown incorrectly (?#/), but when I entered it directly in phpMyAdmin ...
0
votes
0 answers
6 views

REST service is decoding parameters twice

I have a problem where my REST service host keeps decoding parameters twice. So far I haven't found a way to turn that off. For example, if I want to pass the string "test#test" in the browser, I have ...
0
votes
0 answers
10 views

Character encoding issue on Jenkins Slave

I am using Jenkins in a master/slave configuration using Docker. One of the jobs works perfectly on the master, but after moving it to the slave it is unable to display file names which contain French ...
0
votes
0 answers
14 views

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 398: invalid start byte [on hold]

This error occurred while trying to retrieve an image from data.pr4e.org. I don't know how to fix this error. I have tried inserting str=str.decode(('unicode_escape').encode('utf-8')) but code still ...
-1
votes
1 answer
18 views

print non ascii text in python

I am using python2 and I want to convert the non utf-8 text into readable string. I am trying decode using latin-1 and utf-8 also. But I am getting no success. This is the string s = ' ¤¿à...
0
votes
1 answer
26 views

c++ how to print utf8 symbol using utf8 hex code?

For example, I have the UTF-8 hex code of a symbol, which is e0a4a0. How can I use this hex code to print the character ठ to a file in C++?
0
votes
1 answer
30 views

broken UTF8 characters in MySQL using PHP

I have noticed that names with non-standard spelling, i.e. characters outside the standard UK/US alphabet, are getting lost between my inserting a record and what actually shows up in the database. I have ...
3
votes
3 answers
51 views

Writing to JSON - Converting \u00a3 to £

I am using Selenium and Python to scrape a website. I am scraping some '£' characters, however I am getting this instead: \u00a3, when writing to JSON (they appear as '£' when I print them to terminal)...
0
votes
0 answers
22 views

EPPLUS saving XLSX file while using UTF-8 encoding without BOM

I'm using EPPlus (4.5.2.1) and I have a question about the encoding of the XLSX and its underlying XML parts. I must use C# code inside a project that is unzipping the XLSX file and then reading the "...
0
votes
0 answers
13 views

How to correctly log accented characters in a DB?

What I checked: PhpStorm encoding: UTF-8 without BOM. In PDO: charset=UTF8. mb_detect_encoding = UTF-8. In phpMyAdmin: connection to the server: UTF-8_bin; DB, table and columns: UTF-8 bin ...
-1
votes
0 answers
28 views

How to convert a Unicode (UTF-8) text file to PDF?

I would like to convert a text file containing Unicode characters in UTF-8 to a PDF file. When I cat the file or look at it with vim, everything is great, but when I open the file with LibreOffice, ...
0
votes
1 answer
57 views

Convert utf-8 to single-byte encoding

I have a batch of wrongly encoded records. This one-liner gives me a correct result: cat example.txt | iconv -f utf-8 -t iso8859-2 But the following program gives me the error encoding: rune not ...
1
vote
2 answers
29 views

Reading Chinese text from a file and printing it to the shell

I'm trying to make a program which reads lines of Chinese characters from a .txt file and prints them to the Python shell (IDLE?). The problem I'm having is trying to encode and decode the characters ...
-2
votes
0 answers
15 views

Decoding UTF-8 for AWS gameday scavenger hunt

I'm doing the AWS GameDay scavenger hunt right now, can someone help me? The clue is: https://s3-us-west-2.amazonaws.com/s3scavengerhunt/clue.txt As you can see, it's an S3 site with a txt file that said "...
0
votes
2 answers
20 views

Which encoding to open utf-8 csv file in Python which opens correctly in Excel with Windows (ANSI)

I have a database export in CSV which is UTF-8 encoded. When I open it in Excel, I have to choose Windows (ANSI) on opening in order for special characters to display correctly (é, è, à for instance). ...
0
votes
0 answers
8 views

laravel-snappy (wkhtmltopdf) title utf-8 encoding

I have a snappy generated base64 SVG embedded to my page in an <object> tag dynamically with React. My problem is that the title option of wkhtmltopdf doesn't respect UTF-8 characters. And the ...
-1
votes
1 answer
12 views

include characters UTF-8 in prompt

I want to include the character ` (grave accent) in the prompt. However, I did not find any code such as \xe2\x88\x80 as I saw for other characters. What is the corresponding code in this case? ...
0
votes
1 answer
14 views

How can I get the value in utf-8 from an axios get receiving iso-8859-1 in node.js

I have the following code: const notifications = await axios.get(url) const ctype = notifications.headers["content-type"]; The ctype receives "text/json; charset=iso-8859-1" And my string is like ...
0
votes
1 answer
29 views

U+FFFD � special characters are being inserted inside a string in PHP [duplicate]

So I'm trying to prepend a <br> tag in front of the longest word encountered in a given string in PHP. The strings I'm working with may also contain characters from various languages, but all ...
2
votes
1 answer
76 views

How to handle UTF-8 emoji in sed?

I've seen many topics about escaping and replacing a special character in SED, but none of them helped me. I have this sed command I need to use on a file: sed -i "s/This[^\|]\+/& (cool) /g" "...
-1
votes
1 answer
48 views

Manage a Chinese .txt in C++

I'm trying to show Chinese text in the console. It has been pasted from Wikipedia into a .txt file (I don't know the encoding, maybe UTF-8?) // reading a text file #include <iostream> #...
-4
votes
1 answer
24 views

Python convert utf-8 back to string

I have a string which looks like a ='Verm\xc3\xb6gensverzeichnis' When I do print(a), it shows me the right result, which is Vermögensverzeichnis. print(a) Vermögensverzeichnis What I want to ...
0
votes
0 answers
19 views

Jupyter Notebook validation failed

I am using JetBrains PyCharm Community Edition 2018.2.4 and I use Jupyter Notebook via PyCharm. When I open my Jupyter notes in the browser I receive this error: The save operation succeeded, but the ...
0
votes
0 answers
10 views

mFast Library : How to convert the feeds received from exchange to UTF-8 encoded hex format?

I am facing an issue in the mFast (https://github.com/objectcomputing/mFAST) library. It occurs when I receive the data from the exchange and convert it to UTF-8 encoded hex format. Example of the data received ( "Ýÿ”...
0
votes
0 answers
38 views

I have a problem with one of the exporting of excel/csv in symfony 2.8 application

I have a problem with csv uploading in a symfony 2.8 application but it is only occurring in one client. The log from CloudWatch: CRITICAL: Uncaught PHP Exception RuntimeException: "Your data ...
0
votes
1 answer
24 views

How to address Compatibility Error with ruby

I have a ruby program that parses a large block of text with a number of regular expressions. The problem I'm having is that anytime the text contains 'special characters' (for example Kuutõbine or ...
-1
votes
2 answers
82 views

The output is the same as input. How to fix?

ANSI to UTF-8 converter. The main problem is that the output is the same as input. How to fix it? #include <windows.h> #include <stdio.h> #include <stdlib.h> int main(int argc, ...
-1
votes
1 answer
25 views

Convert file to UTF-8 and preserve modification timestamp

Converting files (in this case ISO-8859-1) to UTF-8 is pretty easy in Linux. Have been using: find . -name "*.txt" -exec iconv -f ISO-8859-1 -t UTF-8 {} -o {}.utf8 \; vim "+set nomore" "+bufdo set ...
0
votes
1 answer
18 views

How to identify if text encoding issue is my processing error or carried from the source pdf

I have a selection of PDFs that I want to text mine. I use Tika to parse the text out of each PDF and save it to a .txt with UTF-8 encoding (I'm using Windows). Most of the PDFs were OCR'd before I got ...
-2
votes
2 answers
71 views

emojis show up as question marks after inserting into database php

I know this has been asked many times before and I have tried nearly all of the solutions, but my case is different, so please read on! Do not downvote! As I said, I have tried most of the solutions! I have used ...
0
votes
1 answer
18 views

Can't load CSV File into MySQL Workbench via Import Wizard

When I try to load this CSV into a MySQL DB I get the following error. Platz;Team;Saison;Spieltag;Punkte;Sieg;Unentschieden;Niederlage;Geschossen;Bekommen;Differenz;berUnter;HeimAuswrts;Gegner;...
0
votes
1 answer
36 views

Is Python 2.7 actually converting my string to UTF-8 or is the definition of isalnum() different across different machines?

My sample.txt: é Roméo et Juliette vécu heureux chaque après My program: #!/usr/bin/env python2.7 # -*- coding: utf-8 -*- with open("test4", "r") as f: s = f.read() print(s) ...
0
votes
0 answers
35 views

Pandas returns UnicodeDecodeError:

While executing below code: import pandas as pd l = [{"id":"1","nazwa":"ĄĆĘŁŻŹąćęłżź"}, {"id":"2","nazwa":"nazwa2"}, {"id":"3","nazwa":"nazwa3"} ] df = pd.DataFrame(l) df.to_csv("file1.csv....
1
vote
0 answers
27 views

Java byte array sent with contentType: application/octetstream losing negative byte values

I have an array of integers in java, and I am converting it to a ByteArray public byte[] intArrayToByteArray(int[] arr) { // create byte buffer ByteBuffer byteBuf = ByteBuffer.allocate(arr....
1
vote
1 answer
32 views

Slash(/) in URI gives HTTP 400 Bad request error from Tomcat

I am using Tomcat 7.0.54.0. There is one REST API in my application using the GET method. In the URI for this API, there is one slash character (/) which I am encoding before calling the API. That ...
-1
votes
1 answer
46 views

Python - Changing the accent characters in a JSON file to a regular character

I currently have a JSON file that I am trying to load in Python, however because of accented characters I am getting errors. Is there a way for me to replace the accented characters with regular ...
1
vote
1 answer
29 views

How to turn unicode character into \Uxxxxxxxx format in python 3

I have a Unicode character like 🏆 and I want to get back the \Uxxxxxxxx format. So far, I couldn't find an easy way. Already tried: text = 🏆 text.encode('utf-32').decode('utf-8') returns ...
0
votes
0 answers
21 views

Convert from utf-8 character to hex using javascript (steganography)

Below is the JavaScript code that converts ASCII characters to hex image. It converts the original text into hexadecimal format and groups them by three letters, which will be drawn as a single ...
0
votes
2 answers
36 views

Unicode issue: How to convert ’ to ’ in the response from HttpClient?

The String s and byte[] b in the code below contain different representations of roughly the same thing. import java.io.UnsupportedEncodingException; import java.nio.charset.Charset; import org....
0
votes
0 answers
18 views

nroff/groff does not properly convert utf-8 encoded file

I have a UTF-8 encoded roff file that I want to convert to a manpage with $ nroff -mandoc inittab.5 However, characters such as [äöüÄÖÜ] are not displayed properly, as it seems that nroff ...
0
votes
1 answer
16 views

What is the longest UTF8 representation of an NFC-form string of a given length?

Context. I'm writing C to the iCal (RFC 5545) spec. It specifies the maximum length of a delimited line to be 75 octets excluding the delimiter. Both the robustness principle and the W3C character ...
0
votes
1 answer
20 views

How to decode utf-16 emoji surrogate pairs into utf-8 and display them correctly in html?

I have a string which contains xml. It has the following substring <Subject>&amp;#55357;&amp;#56898;&amp;#55357;&amp;#56838;&amp;#55357;&amp;#56846;&amp;#55357;&...
0
votes
1 answer
48 views

Running shell scripts with special chars in unix

I'm having a tedious problem with my shell script. It copies a file from another server to its own. The trouble is here: the file to be copied has a special char in its name, like this: "CDACampaña". ...
1
vote
2 answers
32 views

PHP and C# Different Unicode Output

Hi, I have a multilingual line: "Hindi - हिंदी , Chinese - 痴呢色 ,Russian - руссиан" I need to perform a URLEncode on it after converting it into UTF-8 in both PHP and C#. <?php $sms_text ='Hindi - ...
1
vote
0 answers
25 views

Apache httpclient HTTPGET decoding

I'm consuming an API which encodes German umlauts in UTF-8. My application needs to decode and encode these characters. Encoding is not a problem and works totally fine. To achieve that I set a charset ...
0
votes
0 answers
8 views

Change Excel encoding to UTF-8 in Microsoft Excel on Mac

I am using Microsoft Excel 16.16.1 on Mac 10.13.6 and I want to change the encoding of Excel to UTF-8, since the letters appear to be displayed wrong: When I open the file using TextEdit this text ...
0
votes
2 answers
34 views

Unicode characters in XSLT

I have a problem with showing characters in Unicode encoding. For example, in XML I have a text which I transform to HTML with the help of XSLT. The text is for example "Najlepší" and characters "š" ...
-2
votes
1 answer
35 views

Python importing .csv files in utf-8 or cp1252

I asked a question a while back about dealing with import of .csv files with special characters. At the time I was interested in solving the 90% case, but now I'm back for the last 10%. It's mostly ...