| From: Steve Upton | Date Sent: 2010-02-09 12:43:29 |
| Subject: filename containing '#' mucking up... | To: Lasso Talk |
Hi all, again,
This is an odd one. Our server handles many uploads a day. Perhaps this is the first time we've seen this file name but it seems that Lasso messes up processing any file that has a "#" in the filename.
I think it might be at the stage where we copy it from the upload area to our storage area - I'm in the process of isolating the problem.
Lasso 8.5.5 on OS X Server 10.4.x
has anyone else seen this problem?
thanks,
Steve
| From: Eric Landmann | Date Sent: 2010-02-09 12:43:58 |
| Subject: Re: filename containing '#' mucking up... | To: Lasso Talk |
On 2/9/10 at 2:43 PM, upton@[Protected] (Steve Upton) wrote:
>This is an odd one. Our server handles many uploads a day.
>Perhaps this is the first time we've seen this file name but it
>seems that Lasso messes up processing any file that has a "#"
>in the filename.
>
>I think it might be at the stage where we copy it from the upload area to our storage area - I'm in the process of isolating the problem.
>
>Lasso 8.5.5 on OS X Server 10.4.x
>
>has anyone else seen this problem?
We avoid this situation using a bit of regex:
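// Take the last path segment, URL-encode it, then strip every %XX sequence
Local:'fileReal' =
(String_ReplaceRegExp:(Encode_StrictURL: $fileRealRaw->(Split:'\\')->Last),
-Find='%[\\da-f]{2}',-Replace='');
// Remove any leading or trailing whitespace
#fileReal->Trim;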
That should fix it (just tested it).
-------------------------------------------------------
Eric Landmann
Landmann InterActive, 6000 Gisholt Dr. #204, Madison, WI 53713 USA
Voice 608-257-1558 iChat: landintraktv
Content Management Systems | eCommerce | Custom Development
| From: Steve Upton | Date Sent: 2010-02-09 12:56:54 |
| Subject: Re: filename containing '#' mucking up... | To: Lasso Talk |
At 2:43 PM -0600 2/9/10, Eric Landmann wrote:
>On 2/9/10 at 2:43 PM, upton@[Protected] (Steve Upton) wrote:
>
>>This is an odd one. Our server handles many uploads a day. Perhaps this is the first time we've seen this file name but it seems that Lasso messes up processing any file that has a "#" in the filename.
>>
>>I think it might be at the stage where we copy it from the upload area to our storage area - I'm in the process of isolating the problem.
>>
>>Lasso 8.5.5 on OS X Server 10.4.x
>>
>>has anyone else seen this problem?
>
>We avoid this situation using a bit of regex:
>
>Local:'fileReal' =
> (String_ReplaceRegExp:(Encode_StrictURL: $fileRealRaw->(Split:'\\')->Last),
> -Find='%[\\da-f]{2}',-Replace='');
>#fileReal->Trim;
>
>That should fix it (just tested it).
Thanks for the response Eric,
But not being a Regex reader (yet, I have a book just not the time) can you explain what the code is doing?
I don't see a # sign in the regex so are you cleaning at a more general level or encoding or...
also, is # normally a problem? It's not a reserved character in any normal paths and it's a normal ASCII character. It seems odd that it causes problems..
thanks
Steve
| From: Steve Piercy - Web Site Builder | Date Sent: 2010-02-09 13:05:52 |
| Subject: Re: filename containing '#' mucking up... | To: Lasso Talk |
On 2/9/10 at 12:56 PM, upton@[Protected] (Steve Upton) pronounced:
>At 2:43 PM -0600 2/9/10, Eric Landmann wrote:
>>On 2/9/10 at 2:43 PM, upton@[Protected] (Steve Upton) wrote:
>>
>>>This is an odd one. Our server handles many uploads a day. Perhaps this is the
>>>first time we've seen this file name but it seems that Lasso
>>>messes up processing any file that has a "#" in the filename.
>>>
>>>I think it might be at the stage where we copy it from the upload area to our
>>>storage area - I'm in the process of isolating the problem.
>>>
>>>Lasso 8.5.5 on OS X Server 10.4.x
>>>
>>>has anyone else seen this problem?
>>
>>We avoid this situation using a bit of regex:
>>
>>Local:'fileReal' =
>>(String_ReplaceRegExp:(Encode_StrictURL: $fileRealRaw->(Split:'\\')->Last),
>>-Find='%[\\da-f]{2}',-Replace='');
>>#fileReal->Trim;
>>
>>That should fix it (just tested it).
>
>Thanks for the response Eric,
>
>But not being a Regex reader (yet, I have a book just not the
>time) can you explain what the code is doing?
>
>I don't see a # sign in the regex so are you cleaning at a more
>general level or encoding or...
>
>also, is # normally a problem? It's not a reserved character in
>any normal paths and it's a normal ASCII character. It seems
>odd that it causes problems..
Within the context of an URL, # is reserved as a named anchor
within a page:
http://mysite.com/index.lasso#jumptoanchor
so it can cause issues in that context. However Lasso should be
able to process the file regardless of its real name.
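To illustrate the URL side of this, here is a small Python sketch (Python rather than Lasso, and the /files/ path is made up for the example). It only shows how a URL parser treats an unencoded # as the start of a fragment; it does not explain what Lasso does internally when copying the file.

```python
from urllib.parse import quote, urlsplit

# An unencoded "#" ends the path; everything after it is the fragment.
parts = urlsplit("http://mysite.com/files/foo#foo.jpg")
print(parts.path)      # /files/foo
print(parts.fragment)  # foo.jpg

# Percent-encoding the filename keeps the "#" inside the path as %23.
safe_url = "http://mysite.com/files/" + quote("foo#foo.jpg")
print(urlsplit(safe_url).path)  # /files/foo%23foo.jpg
```

So any link to such a file needs the # encoded as %23, whatever name the file has on disk.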
Eric's method URL-encodes any URL-hostile character and then
strips the encoded sequences, so those characters are removed.
I like to make all file uploads lower-case, and replace any
non-alpha, non-number, non-period character with an underscore:
// set new filename
$newfile = $files->find('upload.realname')->split('/')->last->split('\\')->last;
$newfile->lowercase;
$newfile = string_replaceregexp($newfile,
-find='[^a-z0-9.]', -replace='_');
It's a matter of personal preference or business need, and there
is no "right" way.
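For anyone who does not read Lasso, the same idea can be sketched in Python. This is only an illustration of the approach above, not a translation checked against Lasso; the function name underscore_name is made up for the example.

```python
import re

def underscore_name(raw_name: str) -> str:
    """Lower-case the name and replace anything outside a-z, 0-9 and '.' with '_'."""
    # Keep only the last path component, whether the client sent / or \ separators.
    base = raw_name.split("/")[-1].split("\\")[-1]
    base = base.lower()
    # Lower-casing first matters: the character class below has no upper-case range.
    return re.sub(r"[^a-z0-9.]", "_", base)

print(underscore_name("C:\\Uploads\\Foo#Foo.JPG"))  # foo_foo.jpg
```

Note that the replacement is one underscore per character, so a name like "My Photo (1).png" ends up with a doubled underscore.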
--steve
-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
-- --
Steve Piercy Web Site Builder
Soquel, CA
<web@[Protected]> <http://www.StevePiercy.com/>
| From: Eric Landmann | Date Sent: 2010-02-09 13:04:14 |
| Subject: Re: filename containing '#' mucking up... | To: Lasso Talk |
On 2/9/10 at 2:56 PM, upton@[Protected] (Steve Upton) wrote:
>At 2:43 PM -0600 2/9/10, Eric Landmann wrote:
>>On 2/9/10 at 2:43 PM, upton@[Protected] (Steve Upton) wrote:
>>
>>>This is an odd one. Our server handles many uploads a day. Perhaps this is the first time we've seen this file name but it seems that Lasso messes up processing any file that has a "#" in the filename.
>>>
>>>I think it might be at the stage where we copy it from the upload area to our storage area - I'm in the process of isolating the problem.
>>>
>>>Lasso 8.5.5 on OS X Server 10.4.x
>>>
>>>has anyone else seen this problem?
>>
>>We avoid this situation using a bit of regex:
>>
>>Local:'fileReal' =
>>(String_ReplaceRegExp:(Encode_StrictURL: $fileRealRaw->(Split:'\\')->Last),
>>-Find='%[\\da-f]{2}',-Replace='');
>>#fileReal->Trim;
>>
>>That should fix it (just tested it).
>
>Thanks for the response Eric,
>
>But not being a Regex reader (yet, I have a book just not the time) can you explain what the code is doing?
>
>I don't see a # sign in the regex so are you cleaning at a more general level or encoding or...
>
>also, is # normally a problem? It's not a reserved character in any normal paths and it's a normal ASCII character. It seems odd that it causes problems..
Shoot, thought I was going to get away clean. That bit is
something I picked up along the line
(http://www.listsearch.com/Lasso/Thread/index.lasso?10511#191599).
Regex is voodoo to me, so I'm afraid I can't help with that
question. Just know that it works.
--Eric
| From: Steve Upton | Date Sent: 2010-02-09 13:17:58 |
| Subject: Re: filename containing '#' mucking up... | To: Lasso Talk |
At 1:05 PM -0800 2/9/10, Steve Piercy - Web Site Builder wrote:
>>
>>But not being a Regex reader (yet, I have a book just not the time) can you explain what the code is doing?
>>
>>I don't see a # sign in the regex so are you cleaning at a more general level or encoding or...
>>
>>also, is # normally a problem? It's not a reserved character in any normal paths and it's a normal ASCII character. It seems odd that it causes problems..
>
>Within the context of an URL, # is reserved as a named anchor within a page:
>
>http://mysite.com/index.lasso#jumptoanchor
>
>so it can cause issues in that context. However Lasso should be able to process the file regardless of its real name.
>
>Eric's method URL-encodes any URL-hostile character and then strips the encoded sequences, so those characters are removed.
>
>I like to make all file uploads lower-case, and replace any non-alpha, non-number, non-period character with an underscore:
>
> // set new filename
> $newfile = $files->find('upload.realname')->split('/')->last->split('\\')->last;
> $newfile->lowercase;
> $newfile = string_replaceregexp($newfile, -find='[^a-z0-9.]', -replace='_');
>
>It's a matter of personal preference or business need, and there is no "right" way.
OK, well, here's an update.
The problem is I get a file permission problem from Lasso when I try to copy the file from the upload area to the storage area.
The exact same file, without the # in the middle, copies without any problem.
I realize I can filter out the character but the resource created in our system uses the file name and users will be confused / upset / etc.
And hey, it's only a flippin # sign... weird..
URL encoding shouldn't be an issue as URL access is not entering into things here. I can appreciate that encoding it for download is a good idea but this is just weird..
anyone?
Eric, thanks for the honesty ;-) regex is something I keep bouncing off. I find I can understand small bits for small lengths of time but reading someone else's code is next to impossible.
Steve
| From: Steve Piercy - Web Site Builder | Date Sent: 2010-02-09 14:08:18 |
| Subject: Re: filename containing '#' mucking up... | To: Lasso Talk |
On 2/9/10 at 1:17 PM, upton@[Protected] (Steve Upton) pronounced:
>The problem is I get a file permission problem from Lasso when
>I try to copy the file from the upload area to the storage area.
>
>The exact same file, without the # in the middle, copies without any problem.
Bonk. Okay, I tried it with a file named foo#foo.jpg with my
file_upload code sample from my guide.
http://stevepiercy.com/lasso_stuff/file_perms.lasso
The result was unexpected:
* A file named "foo" appears in destination folder in Finder.
* When I view the uploaded file in Preview (had to force it
open), it appears fine.
* The link that allows you to view the file in the web browser
contains the proper filename "foo#foo.jpg".
* When I click the link, I view a bytes type. This is probably
because Apache does not know what to do with a file that has no
extension ("foo").
I have no idea what is going on either.
I modified my upload code to strip out the # and then it worked,
but that does not allow users to use # in the filename.
$this_file = $files->find('upload.realname')->split('/')->last->split('\\')->last;
// lower-case first, otherwise upper-case letters are also replaced by '_'
$this_file->lowercase;
$this_file = string_replaceregexp($this_file,
-find='[^a-z0-9.]', -replace='_');
Maybe someone else can explain?
--steve
| From: Johan Solve | Date Sent: 2010-02-09 14:09:19 |
| Subject: Re: filename containing '#' mucking up... | To: Lasso Talk |
On Tue, Feb 9, 2010 at 9:56 PM, Steve Upton <upton@[Protected]> wrote:
>>Local:'fileReal' =
>> (String_ReplaceRegExp:(Encode_StrictURL: $fileRealRaw->(Split:'\\')->Last),
>> -Find='%[\\da-f]{2}',-Replace='');
>>#fileReal->Trim;
>>
>>That should fix it (just tested it).
>
> Thanks for the response Eric,
>
> But not being a Regex reader (yet, I have a book just not the time) can you explain what the code is doing?
First it encodes all suspicious characters using encode_stricturl.
Then the regex searches for anything that looks like a URL-encoded entity:
a % followed by two characters in the range 0-9 or a-f. Those matches
are removed (replaced by an empty string).
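The two steps can be sketched in Python for readers who want to experiment outside Lasso. This is an approximation: Python's quote() stands in for encode_stricturl, and the exact set of characters each one encodes may differ. The function name strip_url_hostile is made up for the example.

```python
import re
from urllib.parse import quote

def strip_url_hostile(raw_name: str) -> str:
    """Percent-encode the filename, then delete every %XX escape."""
    # Keep only the last path component (some browsers send a full Windows path).
    base = raw_name.split("\\")[-1]
    # Encode everything except letters, digits and "_.-~".
    encoded = quote(base, safe="")
    # Each %XX escape is dropped, so the original character disappears entirely.
    return re.sub(r"%[0-9A-Fa-f]{2}", "", encoded).strip()

print(strip_url_hostile("Some Great Photo #1.jpg"))  # SomeGreatPhoto1.jpg
print(strip_url_hostile("räksmörgås.txt"))           # rksmrgs.txt
```

Spaces, ampersands, # signs and non-ASCII letters all vanish, since each of them turns into one or more %XX escapes first.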
--
Mvh
Johan Sölve
____________________________________
Montania System AB
Halmstad Stockholm Malmö
http://www.montania.se
Johan Sölve
Mobil +46 709-51 55 70
johan@[Protected]
Kristinebergsvägen 17, S-302 41 Halmstad, Sweden
Telefon +46 35-136800 | Fax +46 35-136801
| From: Steve Upton | Date Sent: 2010-02-09 14:22:26 |
| Subject: Re: filename containing '#' mucking up... | To: Lasso Talk |
At 11:09 PM +0100 2/9/10, Johan Solve wrote:
>On Tue, Feb 9, 2010 at 9:56 PM, Steve Upton <upton@[Protected]> wrote:
>>>Local:'fileReal' =
>>> (String_ReplaceRegExp:(Encode_StrictURL: $fileRealRaw->(Split:'\\')->Last),
>>> -Find='%[\\da-f]{2}',-Replace='');
>>>#fileReal->Trim;
>>>
>>>That should fix it (just tested it).
>>
>> Thanks for the response Eric,
>>
>> But not being a Regex reader (yet, I have a book just not the time) can you explain what the code is doing?
>
>First it encodes all suspicious characters using encode_stricturl.
>Then the regex searches for anything looking like a url encoded entity
>by searching for % followed by 2 characters 0-9 or a-f. Those matches
>are removed (replaced by an empty string).
OK cool, thanks for the decode.
But doesn't that mean that spaces and ampersands and (gasp) # signs will be removed?
That seems a bit harsh to strip fairly normal characters that don't typically cause file system problems....
Which brings us back to doh! Our friend the #
thanks for the tests Steve P. At least I know I'm not completely nuts...
Steve
| From: Eric Landmann | Date Sent: 2010-02-09 14:35:52 |
| Subject: Re: filename containing '#' mucking up... | To: Lasso Talk |
On 2/9/10 at 4:22 PM, upton@[Protected] (Steve Upton) wrote:
>At 11:09 PM +0100 2/9/10, Johan Solve wrote:
>>On Tue, Feb 9, 2010 at 9:56 PM, Steve Upton <upton@[Protected]> wrote:
>>>>Local:'fileReal' =
>>>> (String_ReplaceRegExp:(Encode_StrictURL: $fileRealRaw->(Split:'\\')->Last),
>>>> -Find='%[\\da-f]{2}',-Replace='');
>>>>#fileReal->Trim;
>>>>
>>>>That should fix it (just tested it).
>>>
>>> Thanks for the response Eric,
>>>
>>> But not being a Regex reader (yet, I have a book just not the time) can you explain what the code is doing?
>>
>>First it encodes all suspicious characters using encode_stricturl.
>>Then the regex searches for anything looking like a url encoded entity
>>by searching for % followed by 2 characters 0-9 or a-f. Those matches
>>are removed (replaced by an empty string).
>
>OK cool, thanks for the decode.
>
>But doesn't that mean that spaces and ampersands and (gasp) # signs will be removed?
>
>That seems a bit harsh to strip fairly normal characters that don't typically cause file system problems....
>
>Which brings us back to doh! Our friend the #
>
>thanks for the tests Steve P. At least I know I'm not completely nuts...
We strip out all the other stuff to avoid just these sorts of
problems. In itPage for both image uploads and file uploads we
append a three-digit extension to the root of the filename to
make it unique after it goes through the cleansing regex.
Some Great Photo #1.jpg -> SomeGreatPhoto1_b3f.jpg
The "b3f" part is the variable part that is created from a
custom tag. None of that solves your problem, but thought it
might explain our approach.
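The custom tag that produces the "b3f" part is not shown in this thread, so the following Python sketch is only a guess at the general shape: it assumes the suffix is three random hex characters inserted before the extension. The name add_unique_suffix and the optional suffix parameter are hypothetical.

```python
import os
import secrets
from typing import Optional

def add_unique_suffix(clean_name: str, suffix: Optional[str] = None) -> str:
    """Insert _xxx between the root of the filename and its extension."""
    root, ext = os.path.splitext(clean_name)
    if suffix is None:
        # Assumption: three random hex characters, giving 4096 possible values.
        suffix = secrets.token_hex(2)[:3]
    return f"{root}_{suffix}{ext}"

print(add_unique_suffix("SomeGreatPhoto1.jpg", suffix="b3f"))  # SomeGreatPhoto1_b3f.jpg
```

Three characters leave room for collisions, so a real implementation would presumably check whether the target name already exists and retry.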
--Eric
| From: Bil Corry | Date Sent: 2010-02-09 18:12:40 |
| Subject: Re: filename containing '#' mucking up... | To: Lasso Talk |
Steve Upton wrote on 2/9/2010 1:17 PM:
> I realize I can filter out the character but the resource created in our system uses the file name and users will be confused / upset / etc.
If preserving the user-provided filename is of utmost importance, a better solution is to store the original filename in a database, and name the file on disk with a random filename. When you serve the file, you can use the original filename provided by the user -- the user will never know it's saved with a random filename on disk. The caveat, of course, is Lasso must serve the file, not Apache/FTP/etc.
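A minimal Python sketch of that idea, using an in-memory SQLite table and a temporary folder as stand-ins for the real database and storage area. The table name, column names and function names are all made up for the example.

```python
import sqlite3
import tempfile
import uuid
from pathlib import Path

def make_db() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE uploads (disk_name TEXT PRIMARY KEY, original_name TEXT NOT NULL)"
    )
    return db

def store_upload(db, storage_dir: Path, original_name: str, data: bytes) -> str:
    # The name on disk is random hex, so no user-supplied character reaches the file system.
    disk_name = uuid.uuid4().hex
    (storage_dir / disk_name).write_bytes(data)
    db.execute(
        "INSERT INTO uploads (disk_name, original_name) VALUES (?, ?)",
        (disk_name, original_name),
    )
    return disk_name

def load_upload(db, storage_dir: Path, disk_name: str):
    row = db.execute(
        "SELECT original_name FROM uploads WHERE disk_name = ?", (disk_name,)
    ).fetchone()
    if row is None:
        raise KeyError(disk_name)
    # The caller serves the bytes under the original name, "#" and all.
    return row[0], (storage_dir / disk_name).read_bytes()

db = make_db()
with tempfile.TemporaryDirectory() as tmp:
    storage = Path(tmp)
    disk_name = store_upload(db, storage, "foo#foo.jpg", b"image bytes")
    original_name, data = load_upload(db, storage, disk_name)
print(original_name)  # foo#foo.jpg
```

When serving the download, the original name would typically go into a Content-Disposition header, which is why the application rather than the web server has to deliver the file.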
- Bil
| From: Steve Piercy - Web Site Builder | Date Sent: 2010-02-09 18:18:59 |
| Subject: Re: filename containing '#' mucking up... | To: Lasso Talk |
On 2/9/10 at 6:12 PM, bil@[Protected] (Bil Corry) pronounced:
>Steve Upton wrote on 2/9/2010 1:17 PM:
>>I realize I can filter out the character but the resource created in our system
>>uses the file name and users will be confused / upset / etc.
>
>If preserving the user-provided filename is of utmost
>importance, a better solution is to store the original filename
>in a database, and name the file on disk with a random
>filename. When you serve the file, you can use the original
>filename provided by the user -- the user will never know it's
>saved with a random filename on disk. The caveat, of course,
>is Lasso must serve the file, not Apache/FTP/etc.
You can have your räksmörgås and eat it too.
--steve
| From: Johan Solve | Date Sent: 2010-02-10 03:34:35 |
| Subject: Re: filename containing '#' mucking up... | To: Lasso Talk |
On Wed, Feb 10, 2010 at 3:18 AM, Steve Piercy - Web Site Builder
<Web@[Protected]> wrote:
> On 2/9/10 at 6:12 PM, bil@[Protected] (Bil Corry) pronounced:
>
>> Steve Upton wrote on 2/9/2010 1:17 PM:
>>>
>>> I realize I can filter out the character but the resource created in our
>>> system uses the file name and users will be confused / upset / etc.
>>
>> If preserving the user-provided filename is of utmost importance, a better
>> solution is to store the original filename in a database, and name the file
>> on disk with a random filename. When you serve the file, you can use the
>> original filename provided by the user -- the user will never know it's
>> saved with a random filename on disk. The caveat, of course, is Lasso must
>> serve the file, not Apache/FTP/etc.
>
> You can have your räksmörgås and eat it too.
Yeah, rksmrgs doesn't taste any good.
--
Mvh
Johan Sölve