Thursday, September 25, 2014

Character Encoding in Eclipse

The Problem

Sometimes a web page shows text in several languages. The Google India homepage (https://www.google.co.in/), for example, shows हिन्दी বাংলা తెలుగు मराठी தமிழ் ગુજરાતી ಕನ್ನಡ മലയാളം ਪੰਜਾਬੀ. If we print this text to the Eclipse console using Selenium WebDriver, the output may come out as question marks (???).

Sample Code:

import java.util.List;

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.firefox.FirefoxDriver;

public class Test {

       public static void main(String[] args) {

           WebDriver driver = new FirefoxDriver();
           driver.get("https://www.google.co.in");

           // Collect the language links shown on the homepage
           List<WebElement> links = driver.findElements(By.xpath("//*[@id='addlang']/a"));

           // Print the visible text of each link to the console
           for (WebElement link : links) {
              System.out.println(link.getText());
           }

           // Close the browser and end the WebDriver session
           driver.quit();
       }

}

Output
??????
?????
??????
?????
?????
???????
?????
??????
??????

Why this happens

This happens because of character encoding, and the output can vary from IDE to IDE depending on its character encoding settings. These regional-language characters are Unicode characters, which an encoding such as UTF-8 can represent. If Eclipse is set to an encoding that cannot represent them (Cp1252, for example, is a common default on Windows), each such character is replaced with ? when it is written to the console.
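The replacement can be reproduced without Selenium or Eclipse. The following is a minimal sketch, not the code path Eclipse itself uses: the class name EncodingDemo and the helper roundTrip are made up for illustration, and ISO-8859-1 stands in for whatever non-UTF-8 encoding the console happens to use. It encodes the Hindi word from the output above with a charset that lacks those characters and then decodes the bytes again.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the original sample.
class EncodingDemo {

    // Encode the text with the given charset, then decode the bytes with the same charset.
    // String.getBytes(Charset) substitutes the charset's replacement byte for every
    // character it cannot represent; for ISO-8859-1 that byte is '?'.
    static String roundTrip(String text, Charset charset) {
        byte[] encoded = text.getBytes(charset);
        return new String(encoded, charset);
    }

    public static void main(String[] args) {
        // The Hindi word from the sample output, written with Unicode escapes
        // so that this file compiles under any source encoding.
        String hindi = "\u0939\u093F\u0928\u094D\u0926\u0940";

        // ISO-8859-1 has no Devanagari characters, so all six are replaced.
        System.out.println(roundTrip(hindi, StandardCharsets.ISO_8859_1)); // ??????

        // UTF-8 can encode every Unicode character, so the text survives unchanged.
        System.out.println(roundTrip(hindi, StandardCharsets.UTF_8).equals(hindi)); // true
    }
}
```

The six question marks match the first line of the broken output: one ? for each character that the encoding could not represent.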

How can I resolve the problem

In Eclipse, we can resolve the problem by setting Character Encoding in the following way:

File level Settings:

Right-click the specific file (e.g. Test.java) -> Properties. In the Text file encoding section, select the Other radio button, choose UTF-8 from the combo box, and click OK.

Project level Settings:

Right-click the project -> Properties. In the Text file encoding section, select the Other radio button, choose UTF-8 from the combo box, and click OK.

Eclipse level Settings:

Window -> Preferences -> General -> Workspace. In the Text file encoding section, select the Other radio button, choose UTF-8 from the combo box, and click OK.

Now run the sample code again. It should print the name of each language in its own script.


Final Output

हिन्दी
বাংলা
తెలుగు
मराठी
தமிழ்
ગુજરાતી
ಕನ್ನಡ
മലയാളം
ਪੰਜਾਬੀ
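
If question marks still appear after changing the settings, it may help to check which encoding the launched program actually sees. The sketch below is only a diagnostic and rests on an assumption: that Eclipse hands the configured encoding to the launched JVM through the file.encoding system property. The class name ShowEncoding and the method describe are hypothetical.

```java
import java.nio.charset.Charset;

// Hypothetical diagnostic class; not part of the original sample.
class ShowEncoding {

    // Build a one-line summary of the encoding-related values the running JVM reports.
    static String describe() {
        return "file.encoding=" + System.getProperty("file.encoding")
                + ", default charset=" + Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        // Run this from Eclipse the same way as the Selenium sample
        // and compare the result with the encoding selected in the settings.
        System.out.println(describe());
    }
}
```

If the reported value is not UTF-8, the setting was probably changed at a level that does not apply to the launch in question, and the project or workspace level setting is worth checking next.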

Hope this helps!
