Showing posts with label Chinese. Show all posts

01 September 2025

How to extract text from an image on Ubuntu?

To extract English text -
$ sudo apt install tesseract-ocr
$ tesseract your_image_name.png extracted_text

Tesseract appends .txt to the output name itself, so the command above writes extracted_text.txt.

 

To extract Simplified Chinese text -

$ sudo apt install tesseract-ocr tesseract-ocr-chi-sim

$ tesseract your_image.tiff output_text -l chi_sim

 

To extract Traditional Chinese text -

$ sudo apt install tesseract-ocr tesseract-ocr-chi-tra

$ tesseract your_image.tiff output_text -l chi_tra

 

To extract several languages at once, e.g. English, Simplified Chinese, and Traditional Chinese (with the language packages for each installed) -

$ tesseract your_image.tiff output_text -l eng+chi_sim+chi_tra
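The same commands can be driven from a script. Below is a minimal Python sketch, assuming tesseract is already installed and on the PATH; the function names build_tesseract_command and extract_text are made up for this illustration and are not part of tesseract.

```python
import shutil
import subprocess


def build_tesseract_command(image_path, output_base, languages=("eng",)):
    """Return the tesseract argument list for the given image and languages."""
    # Several languages are joined with "+", e.g. eng+chi_sim+chi_tra.
    return ["tesseract", image_path, output_base, "-l", "+".join(languages)]


def extract_text(image_path, output_base, languages=("eng",)):
    """Run tesseract on one image and return the recognized text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract not found; try: sudo apt install tesseract-ocr")
    subprocess.run(
        build_tesseract_command(image_path, output_base, languages),
        check=True,
    )
    # tesseract appends ".txt" to the output base name it is given.
    with open(output_base + ".txt", encoding="utf-8") as handle:
        return handle.read()
```

For example, extract_text("your_image.tiff", "output_text", ("eng", "chi_sim")) would run the mixed English and Simplified Chinese command and read back output_text.txt.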

14 March 2023

How to install Chinese input on Ubuntu?

To install Chinese Pinyin input on Ubuntu, you'll first install the ibus-pinyin package, then configure it within the Settings app. After installing the necessary packages and logging out and back in, you can add the Chinese Pinyin input source in the Region & Language settings.

Here's a more detailed breakdown:


1. Install ibus-pinyin:
    • Open a terminal (Ctrl+Alt+T).
    • Run sudo apt install ibus-pinyin.
    • Restart ibus: ibus restart.
    • Configure ibus: ibus-setup.
      • In the ibus-setup window, click the "Input Method" tab, then "Add".
      • Select "Chinese" and then "Pinyin".
2. Configure Settings:
    • Open the Settings app (type "Settings" in the search bar).
    • Go to "Region & Language".
    • Click "Manage Installed Languages".
    • Ensure "Keyboard input method system" is set to "ibus".
    • Install Chinese (Simplified) if prompted.
    • Log out and back in.
3. Add the Input Source:
    • Open Settings and go to Keyboard.
    • Click "Input Sources" and the "+" button.
    • Add the Chinese Pinyin input source.

 

 

===================================================

https://askubuntu.com/questions/1408873/ubuntu-22-04-chinese-simplified-pinyin-input-support



  1. Open Settings, go to Region & Language -> Manage Installed Languages -> Install / Remove languages.
  2. Select Chinese (Simplified). Make sure Keyboard Input method system has Ibus selected. Apply.
  3. Reboot
  4. Log back in, reopen Settings, go to Keyboard.
  5. Click on the "+" sign under Input sources.
  6. Select Chinese (China) and then Chinese (Intelligent Pinyin).

In newer versions of Ubuntu, I also seem to have to install ibus-pinyin:

$ sudo apt-get install ibus-pinyin 

$ ibus restart

10 May 2017

Only some of the Chinese characters have changed font

Problem: After applying a Chinese font, only some of the Chinese characters change to the new font.

Cause: The text is Simplified Chinese, but the font being applied is a Traditional Chinese font. Such a font typically has no glyphs for characters that exist only in Simplified Chinese, so those characters stay in another font.

Solution: First convert the text from Simplified to Traditional Chinese, then apply the Traditional font.
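To see which characters are likely to be left out, here is a rough Python sketch. It assumes, as an approximation only, that a Traditional Chinese font covers about the same characters as the Big5 encoding; a real font may cover more or fewer. The function name is made up for this example.

```python
def missing_from_traditional_font(text):
    """Return the characters of text that Big5 cannot encode.

    Big5 stands in for "what a Traditional Chinese font covers";
    this is an approximation, not a real font lookup.
    """
    missing = []
    for char in text:
        try:
            char.encode("big5")
        except UnicodeEncodeError:
            # Not in Big5, so probably a Simplified-only character.
            missing.append(char)
    return missing


print(missing_from_traditional_font("语言"))  # Simplified spelling
print(missing_from_traditional_font("語言"))  # Traditional spelling
```

Characters shared by both scripts pass the check, which matches the symptom: only part of the text picks up the new font.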

21 December 2015

Word with Chinese Contents Can not be Converted into PDF | There is NO PDFMaker | There is NO PDF Menu

If an Office document contains both English and Chinese, it can be converted into PDF only after the Chinese text has been removed from it.

My Office is 2010 64-bit. My Adobe Acrobat is X Pro (10.0.0).

The problems in the subject happen only when Office is 2010 64-bit (32-bit Office is not affected) and Adobe Acrobat is X 10.0.0 or older (Acrobat 10.1 or newer is not affected).

Solution: Within Acrobat, click Help | Check for Updates. After the updates are installed, the Acrobat version will be 10.1.16 (as of 21 December 2015) or newer.

Install a patch from here -

http://www.adobe.com/support/downloads/product.jsp?product=1&platform=Windows

After you have installed the patch, the problems in the subject will be gone.

25 August 2015

When opening a PDF file, the page flashed and became blank, then closed


Solution: Exit the Chinese hand-writing software.

30 June 2015

Chinese characters are not displayed in vi / vim on Windows OS


Solution 1:

(1) Change your system locale to 'Chinese (Simplified, PRC)'. Restart the computer.

(2) Edit the file C:\Program Files (x86)\Vim\_vimrc and append the following code -


" Character encoding {{{
" Encoding Vim uses for display (setting this does not change the file's encoding) {
if has('win32') || has('win64')
    set encoding=utf-8
    set termencoding=chinese
endif
" }
" Encodings to try when opening an existing file. Order matters: character sets listed earlier should be larger than those listed later {
set fileencodings=ucs-bom,utf-8,cp936,gb18030,big5,euc-jp,euc-kr,latin1
" }
" }}}

Open your text file with gvim. You should be able to read Chinese now. Congratulations!
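The fileencodings setting works by trial: Vim tries each listed encoding in order and keeps the first one that reads the file without a conversion error. The Python sketch below imitates that idea only; it is not how Vim is implemented. The codec names are Python's nearest equivalents, ucs-bom (the byte-order-mark check) is left out, and guess_encoding is a name made up for this example.

```python
# Python codec names standing in for the fileencodings entries (ucs-bom omitted).
CANDIDATES = ["utf-8", "cp936", "gb18030", "big5", "euc_jp", "euc_kr", "latin1"]


def guess_encoding(data, candidates=CANDIDATES):
    """Return the first candidate that decodes data without error, else None."""
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue  # This encoding does not fit; try the next one.
        return name
    return None


print(guess_encoding("中文".encode("utf-8")))
print(guess_encoding("中文".encode("gbk")))
```

Because latin1 accepts any byte sequence, it has to come last; anything listed after it would never be tried.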



Solution 2: Download WinVi from http://www.winvi.de/en/email.html . Chinese text can be read in WinVi.

Solution 3: Install Cygwin, selecting the vim package during installation. Run Cygwin and open your file with vim.

14 May 2015

Chinese Characters are Displayed as Question Marks



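Cause: The Chinese text file was saved in ANSI encoding, that is, in the system's legacy code page. If that code page has no Chinese characters, each Chinese character is written to the file as a question mark.

Solution: Save the Chinese text file in a Unicode encoding, such as UTF-8.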
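The effect is easy to reproduce. In this small Python sketch, cp1252 (the Western European Windows code page) is an assumed stand-in for "ANSI" on a non-Chinese system, and save_as_ansi is a name made up for the example.

```python
def save_as_ansi(text, code_page="cp1252"):
    """Encode text as a lossy 'ANSI' save would: unmappable characters become '?'."""
    return text.encode(code_page, errors="replace")


ansi_bytes = save_as_ansi("中文 text")
utf8_bytes = "中文 text".encode("utf-8")

print(ansi_bytes.decode("cp1252"))  # prints "?? text"
print(utf8_bytes.decode("utf-8"))   # prints "中文 text"
```

Once the question marks are in the file, the original characters are gone; reopening the file in another encoding cannot bring them back, so the text has to be saved again from the source.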

08 December 2014

How to copy Chinese and English text from a photo PDF file?

In Adobe Acrobat, go to
Tools | Text Recognition | In this file | Edit | Primary OCR Language.

Choose Chinese; English text will then be recognized as well.

09 July 2014

Chinese displayed as gibberish

To display Chinese correctly, 4 things should be done –

1. When creating the database, specify a Chinese character set –

create database dbsettle character set 'gbk' collate 'gbk_chinese_ci';

use dbsettle;

2. On the web page, in the head section, declare the same character set –

<meta http-equiv="content-type" content="text/html; charset=gbk" />

3. In the programme code, after connecting to the database, set the connection character set –

mysql_select_db("dbsettle",$db);

mysql_query("set names gbk", $db);

4. In the sending email section of the code, the content type should be defined –

// Always set content-type when sending HTML email
$headers = "MIME-Version: 1.0" . "\r\n";
$headers .= "Content-type:text/html;charset=gbk" . "\r\n";

// More headers
$headers .= 'From: xxxx@cicscanada.com' . "\r\n";
// $headers .= 'Cc: myboss@example.com' . "\r\n";

// Send the mail
mail($to, $subject, $message, $headers);

If all the above is done, then Chinese will correctly be displayed in the email.
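Why every layer has to agree can be shown in a few lines of Python. The sketch below is only an illustration (show_as is a made-up name): text stored as GBK bytes comes out as gibberish as soon as any layer reads those bytes as a different character set.

```python
def show_as(text, stored_as, read_as):
    """Encode text in one character set and read the bytes back in another."""
    return text.encode(stored_as).decode(read_as, errors="replace")


print(show_as("中文", "gbk", "gbk"))     # same charset on both sides: 中文
print(show_as("中文", "gbk", "latin1"))  # mismatch: ÖÐÎÄ
```

The database, the connection, the web page and the email header are four such read or write points, so one mismatch among them is enough to produce the gibberish.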

01 January 2013

MySQL does not display Chinese

NOTE: Generally, 3 places matter - (1) when creating a database, include the character set; (2) in the head section of the web page's HTML code, define the character set; (3) after mysql_select_db, call mysql_query("set names gbk;"). That is all.

(1) In the file 'my.ini', add the following two settings:

[client]
default-character-set=gbk

[mysqld]
character-set-server=gbk

Restart the MySQL server.


(2) When creating a database, use -
create database [database name] character set 'gbk' collate 'gbk_chinese_ci';


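(3) Run the command 'show create table ...'.

mysql>show create table [table name];

Check what the default charset is.

If it is not 'gbk', change it to 'gbk'.

For a new table, declare the charset when creating it -

create table if not exists `[table name]` (
...
) engine=InnoDB default charset=gbk;

For an existing table, convert it -

alter table `[table name]` convert to character set gbk collate gbk_chinese_ci;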


(4) In the PHP page, after the statement -

mysql_select_db("dbname", $db);

Add the following statement -

mysql_query("set names gbk;");


After the above 4 steps, Chinese will be displayed.
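The my.ini part of the setup can be checked with a short script. This Python sketch assumes the file is plain INI text that the standard configparser module can read; charset_settings and SAMPLE are names made up for the example, and a real my.ini may contain constructs this does not handle.

```python
import configparser

SAMPLE = """\
[client]
default-character-set=gbk

[mysqld]
character-set-server=gbk
"""


def charset_settings(ini_text):
    """Return the client and server character sets found in my.ini text."""
    parser = configparser.ConfigParser(
        allow_no_value=True,  # my.ini may contain bare flags without "=value"
        strict=False,         # tolerate repeated sections or options
        interpolation=None,   # treat "%" literally
    )
    parser.read_string(ini_text)
    return {
        "client": parser.get("client", "default-character-set", fallback=None),
        "mysqld": parser.get("mysqld", "character-set-server", fallback=None),
    }


print(charset_settings(SAMPLE))
```

Both values should read gbk; a None means the setting is missing from that section.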