
How can I read numeric strings in Excel cells as string (not numbers)?

by Alex Kataev · Sep 16, 2024
TLDR

To read numeric values in Excel cells as strings in Java, leverage the DataFormatter class of Apache POI. Here is a basic example:

    import org.apache.poi.ss.usermodel.*;

    DataFormatter formatter = new DataFormatter();
    Sheet sheet = workbook.getSheetAt(0);
    for (Row row : sheet) {
        for (Cell cell : row) {
            // Print the cell value as seen in Excel. Magic? Nope, just Java!
            System.out.println(formatter.formatCellValue(cell));
        }
    }

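DataFormatter applies each cell's number format and returns the text Excel would display, so a numeric cell comes back as a string that matches its on-screen appearance rather than as a raw double. One caveat: called this way, a formula cell yields its formula text rather than its result (see "Strategy for formulas").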
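To see the effect in isolation, here is a minimal sketch that builds a one-cell workbook in memory and formats the same cell under different number formats. The names `DisplayedValueDemo` and `display` are made up for illustration, `HSSFWorkbook` is used only because it needs no file on disk, and `Locale.US` is passed so the decimal separator does not depend on the machine's default locale.

```java
import java.io.IOException;
import java.util.Locale;

import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.CellStyle;
import org.apache.poi.ss.usermodel.DataFormatter;
import org.apache.poi.ss.usermodel.Workbook;

final class DisplayedValueDemo {

    /** Hypothetical helper: stores the value in a cell with the given Excel format and returns the displayed text. */
    static String display(double value, String excelFormat) throws IOException {
        try (Workbook workbook = new HSSFWorkbook()) {
            Cell cell = workbook.createSheet().createRow(0).createCell(0);
            cell.setCellValue(value);

            // Attach the number format to the cell, as Excel's "Format Cells" dialog would.
            CellStyle style = workbook.createCellStyle();
            style.setDataFormat(workbook.createDataFormat().getFormat(excelFormat));
            cell.setCellStyle(style);

            return new DataFormatter(Locale.US).formatCellValue(cell);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(display(2, "General")); // 2
        System.out.println(display(2, "0.0"));     // 2.0
        System.out.println(display(123, "00000")); // 00123
    }
}
```

The stored number is identical in the first two calls; only the cell's format changes the text that comes back.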

Beating .toString()'s limits

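Calling .toString() on a numeric cell, or converting getNumericCellValue() to a string yourself, ignores the cell's number format, so a cell that displays 2 typically comes back as "2.0". To get the text as it is displayed, use the DataFormatter utility instead: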

    Cell cell = ... // Your cell object. Hope it's not null!
    DataFormatter formatter = new DataFormatter();
    String textValue = formatter.formatCellValue(cell);
    // For bonus points, "2.0" is not the same as "2" now!

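This matters for values that must keep their displayed form, such as account numbers and identification codes. Because the result follows the cell's number format, leading zeros produced by a format like 00000 survive, while a value that Excel displays in scientific notation comes back in scientific notation too.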
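The weakness of naive conversion is visible without POI at all. A numeric cell holds a Java double, and the JDK's default rendering of a double knows nothing about Excel formats. The sketch below uses only the JDK; the class name is made up, and it assumes a numeric cell's toString() falls back to this same default rendering.

```java
final class NaiveDoubleText {

    /** The JDK's default text for a double; no Excel number format is involved. */
    static String naive(double value) {
        return Double.toString(value);
    }

    public static void main(String[] args) {
        System.out.println(naive(2));            // 2.0
        System.out.println(naive(12345678901d)); // 1.2345678901E10
    }
}
```

An 11-digit identifier turning into exponent notation is exactly the kind of surprise that formatting through DataFormatter is meant to avoid.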

Strategy for formulas

Excel formulas pose an extra layer of complexity. A combination of a FormulaEvaluator and DataFormatter is needed:

    import org.apache.poi.ss.usermodel.*;

    Workbook workbook = ... // Put your workbook object here. Make sure it's not empty, though!
    FormulaEvaluator evaluator = workbook.getCreationHelper().createFormulaEvaluator();
    DataFormatter formatter = new DataFormatter();
    for (Row row : workbook.getSheetAt(0)) {
        for (Cell cell : row) {
            // Formulas are tough, so we give them some special treatment
            if (cell.getCellType() == CellType.FORMULA) {
                System.out.println(formatter.formatCellValue(cell, evaluator));
            } else {
                System.out.println(formatter.formatCellValue(cell));
            }
        }
    }

With the evaluator passed in, a formula cell is evaluated first and its result is then formatted, so you get the displayed value rather than the underlying formula string.
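The difference between the two overloads is easy to check on a tiny in-memory workbook. This is a sketch under a few assumptions: the class and method names are hypothetical, `HSSFWorkbook` is chosen only to avoid touching the disk, and `Locale.US` keeps the output independent of the machine's locale.

```java
import java.io.IOException;
import java.util.Locale;

import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.DataFormatter;
import org.apache.poi.ss.usermodel.FormulaEvaluator;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Workbook;

final class FormulaTextDemo {

    /** Returns { text without evaluator, text with evaluator } for the formula A1+B1. */
    static String[] formulaTexts() throws IOException {
        try (Workbook workbook = new HSSFWorkbook()) {
            Row row = workbook.createSheet().createRow(0);
            row.createCell(0).setCellValue(2);
            row.createCell(1).setCellValue(3);
            Cell sum = row.createCell(2);
            sum.setCellFormula("A1+B1");

            DataFormatter formatter = new DataFormatter(Locale.US);
            FormulaEvaluator evaluator = workbook.getCreationHelper().createFormulaEvaluator();

            String withoutEvaluator = formatter.formatCellValue(sum);         // the formula text
            String withEvaluator = formatter.formatCellValue(sum, evaluator); // the computed result
            return new String[] { withoutEvaluator, withEvaluator };
        }
    }

    public static void main(String[] args) throws IOException {
        String[] texts = formulaTexts();
        System.out.println(texts[0]); // A1+B1
        System.out.println(texts[1]); // 5
    }
}
```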

Workbook type: Know your tools!

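The Workbook implementation (HSSF for .xls files, XSSF for .xlsx files) determines which FormulaEvaluator you get: HSSFFormulaEvaluator or XSSFFormulaEvaluator. Calling workbook.getCreationHelper().createFormulaEvaluator(), as in the formula example, returns the matching one, so you rarely need to name the class yourself.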

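For mighty large Excel files, memory becomes the main concern. Note that SXSSFWorkbook is POI's streaming API for writing .xlsx files, not for reading them; to read large .xlsx files with a small memory footprint, look at POI's XSSF event (SAX) API built around XSSFReader.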
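If you would rather not pick between HSSF and XSSF by hand, POI's WorkbookFactory can detect the file type for you. The following is a minimal sketch, not the only way to do it: the class and method names are hypothetical, opening .xlsx files this way assumes the poi-ooxml module is on the classpath, and the method declares `throws Exception` because the checked exceptions differ between POI versions.

```java
import java.io.File;
import java.util.Locale;

import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.DataFormatter;
import org.apache.poi.ss.usermodel.FormulaEvaluator;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.usermodel.WorkbookFactory;

final class AnyWorkbookReader {

    /** Hypothetical helper: displayed text of cell A1 on the first sheet, or "" if the cell does not exist. */
    static String firstCellText(File file) throws Exception {
        // password = null, readOnly = true: the file is opened for reading only.
        try (Workbook workbook = WorkbookFactory.create(file, null, true)) {
            Row row = workbook.getSheetAt(0).getRow(0);
            Cell cell = (row == null) ? null : row.getCell(0);

            DataFormatter formatter = new DataFormatter(Locale.US);
            // The creation helper hands back the evaluator that matches the workbook type.
            FormulaEvaluator evaluator = workbook.getCreationHelper().createFormulaEvaluator();
            return formatter.formatCellValue(cell, evaluator); // a null cell yields ""
        }
    }
}
```

The same code then serves both .xls and .xlsx input, since everything after the factory call goes through the common `org.apache.poi.ss.usermodel` interfaces.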

Dialing it up with NumberToTextConverter

Sometimes you want the number itself as text, independent of whatever format the cell carries. For that, Apache POI's NumberToTextConverter utility is your best bud:

    import org.apache.poi.ss.util.NumberToTextConverter;

    double cellValue = ... // Here's the numeric value in a cell. No funny business!
    String preciseText = NumberToTextConverter.toText(cellValue);
    // The number as Excel would write it out as text, whatever the cell's format says.

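This method converts the double the way Excel does when it turns a number into text: integers come out without a trailing ".0", decimals keep their digits, and very large or very small values use Excel-style scientific notation. Excel works with 15 significant digits, so that is the precision to expect.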
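To make the contrast with DataFormatter concrete, here is a small sketch that reads the same cell both ways. The class and method names are hypothetical, and the in-memory `HSSFWorkbook` and `Locale.US` are choices made only to keep the example self-contained and deterministic.

```java
import java.io.IOException;
import java.util.Locale;

import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.CellStyle;
import org.apache.poi.ss.usermodel.DataFormatter;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.util.NumberToTextConverter;

final class RawVersusDisplayed {

    /** Returns { displayed text, raw number text } for 1234.5678 in a cell formatted as 0.00. */
    static String[] compare() throws IOException {
        try (Workbook workbook = new HSSFWorkbook()) {
            Cell cell = workbook.createSheet().createRow(0).createCell(0);
            cell.setCellValue(1234.5678);

            CellStyle twoDecimals = workbook.createCellStyle();
            twoDecimals.setDataFormat(workbook.createDataFormat().getFormat("0.00"));
            cell.setCellStyle(twoDecimals);

            // DataFormatter honours the cell's format; NumberToTextConverter looks only at the double.
            String displayed = new DataFormatter(Locale.US).formatCellValue(cell);
            String raw = NumberToTextConverter.toText(cell.getNumericCellValue());
            return new String[] { displayed, raw };
        }
    }

    public static void main(String[] args) throws IOException {
        String[] texts = compare();
        System.out.println(texts[0]); // 1234.57
        System.out.println(texts[1]); // 1234.5678
    }
}
```

Pick DataFormatter when you need what the user sees, and NumberToTextConverter when you need the stored number without the format's rounding.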

Cell type iteration

Knowing the cell type before conversion lets you pick the right treatment for each case, so check getCellType() first:

    Cell cell = ... // Your cell object goes here. It's excited to be here!
    switch (cell.getCellType()) {
        case NUMERIC:
            // DataFormatter or NumberToTextConverter enters the chat
            break;
        case STRING:
            // Just get the value, all is cool!
            break;
        case FORMULA:
            // Evaluate before you judge!
            break;
        // Didn't forget other cell types, did you?
    }
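Filled in, the skeleton above might look like the following. Treat it as one possible sketch rather than a canonical implementation: `CellText` and `asText` are hypothetical names, the enum-based switch assumes POI 4.0 or later (where getCellType() returns a CellType), and the remaining cell types are simply delegated to DataFormatter.

```java
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.DataFormatter;
import org.apache.poi.ss.usermodel.FormulaEvaluator;

final class CellText {

    /** Hypothetical helper: returns the cell's content as text, choosing the treatment by cell type. */
    static String asText(Cell cell, DataFormatter formatter, FormulaEvaluator evaluator) {
        if (cell == null) {
            return ""; // a missing cell is treated like a blank one
        }
        switch (cell.getCellType()) {
            case NUMERIC:
                // Text as displayed, following the cell's number format.
                return formatter.formatCellValue(cell);
            case STRING:
                // Already text, so take it as-is.
                return cell.getStringCellValue();
            case FORMULA:
                // Evaluate first, then format the result.
                return formatter.formatCellValue(cell, evaluator);
            default:
                // BOOLEAN, BLANK and ERROR cells: let DataFormatter decide.
                return formatter.formatCellValue(cell);
        }
    }
}
```

Create the DataFormatter and FormulaEvaluator once per workbook and pass them in, rather than building new ones for every cell.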