The Web Design Group

... Making the Web accessible to all.

Welcome Guest ( Log In | Register )

 
Reply to this topicStart new topic
> Skipping blank lines while reading a file
CharlesEF
post Nov 11 2020, 06:18 PM
Post #1


Programming Fanatic
********

Group: Members
Posts: 1,981
Joined: 27-April 13
From: Edinburg, Texas
Member No.: 19,088



Hi All,

Sounds simple, right? Well, I find that it turns out not to be. Nothing I've tried has worked. So, maybe someone else has a better idea.

The attached sample file is part of a bigger file generated by Excel 2016. It is formatted as UTF-16 LE but I re-save it as UTF-8. The goal of the script is to print only non blank lines but I can't get it to work. sad.gif

Does anyone have a better idea?


Thanks for any and all help,

Charles
Attached File  read_file.php ( 467bytes ) Number of downloads: 391
Attached File  sample.txt ( 16.75k ) Number of downloads: 379

I had to rename sample.sql to sample.txt in order to upload it. You will either have to change it back or change the script.
User is offlinePM
Go to the top of the page
Toggle Multi-post QuotingQuote Post
Christian J
post Nov 11 2020, 06:42 PM
Post #2


.
********

Group: WDG Moderators
Posts: 9,628
Joined: 10-August 06
Member No.: 7



QUOTE(CharlesEF @ Nov 12 2020, 12:18 AM) *

Attached File  read_file.php ( 467bytes ) Number of downloads: 391


I get a PHP error when trying to open that file. Surely the forum software can handle attachments with .php extensions? unsure.gif

QUOTE
I had to rename sample.sql to sample.txt in order to upload it.

Are you sure the linebreak characters are still the same as in the original file?

User is offlinePM
Go to the top of the page
Toggle Multi-post QuotingQuote Post
pandy
post Nov 11 2020, 07:24 PM
Post #3


🌟Computer says no🌟
********

Group: WDG Moderators
Posts: 20,716
Joined: 9-August 06
Member No.: 6



What gave you the error? The forum? wacko.gif

No problem for me! Here it is. cool.gif

CODE

<!DOCTYPE HTML>
<html>
<head>
<title>Read Test</title>
</head>
<body>
<?php
$bom = pack("H*", "EFBBBF");
if(($reading = fopen("sample.sql", "r")) !== false)
{
$sql = preg_replace("/^$bom/", "", fgets($reading));
while(!feof($reading))
{
  if(!empty($sql))
  {
   echo("{$sql}<br>");
   $sql = fgets($reading);
  }
}
if(!feof($reading))
{
  echo("Unexpected read error in file." . PHP_EOL);
}
fclose($reading);
}
?>
</body>
</html>


User is offlinePM
Go to the top of the page
Toggle Multi-post QuotingQuote Post
Christian J
post Nov 11 2020, 08:30 PM
Post #4


.
********

Group: WDG Moderators
Posts: 9,628
Joined: 10-August 06
Member No.: 7



QUOTE(pandy @ Nov 12 2020, 01:24 AM) *

What gave you the error? The forum? wacko.gif

Seems my browser opens the PHP file instead of displaying a download dialog, and for some reason the last part of the script got displayed:

CODE
"); $sql = fgets($reading); } } if(!feof($reading)) { echo("Unexpected read error in file." . PHP_EOL); } fclose($reading); } ?>

I did think it was the forum at first. I shouldn't stay up this late. wacko.gif
User is offlinePM
Go to the top of the page
Toggle Multi-post QuotingQuote Post
Christian J
post Nov 11 2020, 08:43 PM
Post #5


.
********

Group: WDG Moderators
Posts: 9,628
Joined: 10-August 06
Member No.: 7



QUOTE(CharlesEF @ Nov 12 2020, 12:18 AM) *

The goal of the script is to print only non blank lines but I can't get it to work. sad.gif

Does anyone have a better idea?

How about just deleting superflouos newline characters?
User is offlinePM
Go to the top of the page
Toggle Multi-post QuotingQuote Post
pandy
post Nov 11 2020, 08:57 PM
Post #6


🌟Computer says no🌟
********

Group: WDG Moderators
Posts: 20,716
Joined: 9-August 06
Member No.: 6



QUOTE(Christian J @ Nov 12 2020, 02:30 AM) *

QUOTE(pandy @ Nov 12 2020, 01:24 AM) *

What gave you the error? The forum? wacko.gif

Seems my browser opens the PHP file instead of displaying a download dialog, and for some reason the last part of the script got displayed:

CODE
"); $sql = fgets($reading); } } if(!feof($reading)) { echo("Unexpected read error in file." . PHP_EOL); } fclose($reading); } ?>

I did think it was the forum at first. I shouldn't stay up this late. wacko.gif


Opera at it again? But that isn't an error message. It's code that can generate an error message. tongue.gif

We both seem to stay up too late lately. laugh.gif
User is offlinePM
Go to the top of the page
Toggle Multi-post QuotingQuote Post
CharlesEF
post Nov 11 2020, 10:37 PM
Post #7


Programming Fanatic
********

Group: Members
Posts: 1,981
Joined: 27-April 13
From: Edinburg, Texas
Member No.: 19,088



QUOTE(Christian J @ Nov 11 2020, 05:42 PM) *

QUOTE(CharlesEF @ Nov 12 2020, 12:18 AM) *

Attached File  read_file.php ( 467bytes ) Number of downloads: 391


I get a PHP error when trying to open that file. Surely the forum software can handle attachments with .php extensions? unsure.gif

QUOTE
I had to rename sample.sql to sample.txt in order to upload it.

Are you sure the linebreak characters are still the same as in the original file?

Hi,
Yes, they are the same line break characters. In fact when I use bin2hex I can see the blank lines contain '0d0a', which is Windows default for lf/cr. I've even tried to test for that sequence but it either crashes my browser or my web server (IIS). First I extract the first 2 characters from the $sql string. Then I use bin2hex on those 2 characters and I get 0d0a. In my if statement I test for ($var != "0d0a") and it crashes my web server, I have to restart it. If I test for ($var != 0x0d0a) then the script runs but still prints the blank lines.

Here is some air code to show what I'm doing.
CODE
while(...)
{
  $lf = bin2hex(substr($sql, 0, 2));
  if($lf != "0d0a") // Will crash IIS
  if($lf != 0x0d0a) // Will work but still prints blank lines
  {
    echo($sql);
    get next line
  }
}


I'm pulling my hair out over this! huh.gif Do you what me to upload the script again? Or did you get it working?

Charles

This post has been edited by CharlesEF: Nov 11 2020, 11:11 PM
User is offlinePM
Go to the top of the page
Toggle Multi-post QuotingQuote Post
CharlesEF
post Nov 11 2020, 10:48 PM
Post #8


Programming Fanatic
********

Group: Members
Posts: 1,981
Joined: 27-April 13
From: Edinburg, Texas
Member No.: 19,088



This is part of my script to load data into the database. PHP reads the file and sends each line to MySQL. When MySQL gets a blank line it returns an error message. My script reports each success, failure or warning message to the user. Some files are small but some contain 1000's of lines. I don't want the user to search through 1000's of lines to find a useless error message. That's why I'm trying to skip blank lines.

This script can be used by the user to import their own data into the database. So, I should test for both \n and \r\n cases. (testing with those character doesn't work either) But right now I can't even test for one.
User is offlinePM
Go to the top of the page
Toggle Multi-post QuotingQuote Post
CharlesEF
post Nov 12 2020, 02:55 AM
Post #9


Programming Fanatic
********

Group: Members
Posts: 1,981
Joined: 27-April 13
From: Edinburg, Texas
Member No.: 19,088



Let me share what I found so far. This is the PHP code I'm working with:
CODE
<?php
$bom = pack("H*", "EFBBBF");
if(($reading = fopen("sample.sql", "r")) !== false)
{
$sql = preg_replace("/^$bom/", "", fgets($reading));
while(!feof($reading))
{
  $sql = preg_replace('/\s+/', ' ', $sql);
  if(!empty($sql))
  {
   echo("{$sql}<br>");
   $sql = fgets($reading);
  }
}
if(!feof($reading))
{
  echo("Unexpected read error in file." . PHP_EOL);
}
fclose($reading);
}
?>
Basically, I added 1 line:
CODE
$sql = preg_replace('/\s+/', ' ', $sql);
, This does remove all whitespace but I must leave 1 space. If I try to leave no space then IIS will crash. You would think that wouldn't matter because !empty() should catch it and skip the line. Well, it doesn't, it still prints a blank line. I've used strlen() on it and it tells me the blank lines have a length of 1. If I use trim() on it IIS crashes. If I use strlen($sql) != 1 instead of !empty() IIS crashes.

I'm positive that 1 line does remove 0d0a but leaves 1 blank character. Why am I have so much trouble with this?
User is offlinePM
Go to the top of the page
Toggle Multi-post QuotingQuote Post
Christian J
post Nov 12 2020, 08:53 AM
Post #10


.
********

Group: WDG Moderators
Posts: 9,628
Joined: 10-August 06
Member No.: 7



QUOTE(CharlesEF @ Nov 12 2020, 04:37 AM) *

Do you what me to upload the script again? Or did you get it working?

It was just user error. blush.gif

Don't know if this user comment is relevant: https://www.php.net/manual/en/function.fgets.php#118733 ?
I'm not familiar with using fgets() for loops, but can't you just test if each line starts with the string "INSERT" (or any non-whitespace character), and otherwise skip that line?

Or (since I don't know fgets) I'd try to replace all double newlines with a single one. This assumes there's no other whitespace characters between the newlines.

Or maybe:
1) use file_get_contents() on sample.sql,
2) explode() its contents by "INSERT INTO",
3) loop through the exploded array (adding back a single newline and "INSERT INTO" for each recreated row).
User is offlinePM
Go to the top of the page
Toggle Multi-post QuotingQuote Post
CharlesEF
post Nov 12 2020, 11:37 AM
Post #11


Programming Fanatic
********

Group: Members
Posts: 1,981
Joined: 27-April 13
From: Edinburg, Texas
Member No.: 19,088



I have extracted the 1st character from $sql. I then test the $var == "I", instead of !empty($sql), but it crashes IIS. I need to take a break from this.
User is offlinePM
Go to the top of the page
Toggle Multi-post QuotingQuote Post
CharlesEF
post Nov 13 2020, 04:06 AM
Post #12


Programming Fanatic
********

Group: Members
Posts: 1,981
Joined: 27-April 13
From: Edinburg, Texas
Member No.: 19,088



After some much needed sleep I found my problem (sort of). I still think empty() should see " " as empty, I'll check the docs again.

To fix my code I had to change the str_replace (instead of preg_replace I had before, read it's faster) to remove the newlines completely. Then I had to move fgets out of the if statement (if the line is blank you still need to get the next line).

Here is the corrected code:
CODE
<?php
$bom = pack("H*", "EFBBBF");
if(($reading = fopen("sample.sql", "r")) !== false)
{
$sql = preg_replace("/^$bom/", "", fgets($reading));
while(!feof($reading))
{
  $sql = str_replace(array("\n", "\r", "\r\n"), "", $sql);
  if(!empty($sql))
  {
   echo("{$sql}<br>");
  }
  $sql = fgets($reading);
}
if(!feof($reading))
{
  echo("Unexpected read error in file." . PHP_EOL);
}
fclose($reading);
}
?>

Thanks for looking.
User is offlinePM
Go to the top of the page
Toggle Multi-post QuotingQuote Post
CharlesEF
post Nov 15 2020, 07:32 PM
Post #13


Programming Fanatic
********

Group: Members
Posts: 1,981
Joined: 27-April 13
From: Edinburg, Texas
Member No.: 19,088



A follow up to !empty not skipping blank lines. As I tested I found that there where 2 spaces at the end of the line. This made me realize the str_replace search order was wrong. It was finding and replacing based on \n then \r, not \r\n. This is the fix:

CODE
  $sql = str_replace(array("\r\n", "\n", "\r"), "", $sql);

This corrects the 2 space problem (now I just remove newlines). But, before removing newlines I tried replacing them with a space. !empty did not skip the blank lines. When I replace newlines with a 0 !empty did skip blank lines. I checked the docs and a space should be considered empty. Oh well, at least it works now.
User is offlinePM
Go to the top of the page
Toggle Multi-post QuotingQuote Post

Reply to this topicStart new topic
2 User(s) are reading this topic (2 Guests and 0 Anonymous Users)
0 Members:

 



- Lo-Fi Version Time is now: 19th March 2024 - 03:42 AM