[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [coldsync-hackers] Patch: Palm m50x USB support (Linux)



Greg KH wrote:

> > Known bugs:
> > Sometimes coldsync hangs. This seems related to the problems
> > that have been reported for the USB Visor support. It looks
> > like data from the Palm to coldsync gets lost somewhere
> > on the way. There is maybe something wrong with the USB
> > serial/visor driver?
> 
> Coldsync only seems to have this "problem" so I'm thinking that it's
> something in the program.  Pilot-xfer doesn't seem to have this problem
> (i.e. I can't reproduce it.)

Damn. This means I hacked the wrong program! (not really, coldsync
was way more hacker friendly)

> If you have any dumps that show the problem is in the visor driver,
> please let me know.  I'd be glad to fix them.  Only ones I've seen are
> ones that the data didn't seem to come in at all in on the USB
> connection, which tends to mean the Palm itself is messing up somewhere
> (nah, there can't be bugs in the Palm...)

Since I spent the weekend getting my own driver (palm.c) into 2.4.5
(unfortunately I saw only yesterday evening it was already in 2.4.6),
I thought it was my mistake and did some extensive diagnostics of
that problem and another mystery.

The first mystery is that the first 6B (which happen to be a header)
of communication on the ttyUSB1 pipe from palm to PC are missing!

This is the relevant output from the USBSnoopy program:

00000210        0.39925600      >>>>>>> URB 7 going down...     
00000211        0.39926960      --
URB_FUNCTION_BULK_OR_INTERRUPT_TRANSFER:     
00000212        0.39928160        PipeHandle           = c1600410       
00000213        0.39929920        TransferFlags        = 00000003
(USBD_TRANSFER_DIRECTION_IN, USBD_SHORT_TRANS
FER_OK) 
00000214        0.39931040        TransferBufferLength = 00000006       
00000215        0.39932000        TransferBuffer       = 00000000       
00000216        0.39933120        TransferBufferMDL    = c140e750       
00000217        0.39934160        UrbLink              = 00000000       
00000218        0.40127920              
00000219        0.40128640      <<<<<<< URB 7 coming back...    
00000220        0.40130480      --
URB_FUNCTION_BULK_OR_INTERRUPT_TRANSFER:     
00000221        0.40132160        PipeHandle           = c1600410       
00000222        0.40134240        TransferFlags        = 00000003
(USBD_TRANSFER_DIRECTION_IN, USBD_SHORT_TRANS
FER_OK) 
00000223        0.40135680        TransferBufferLength = 00000006       
00000224        0.40136960        TransferBuffer       = 00000000       
00000225        0.40138400        TransferBufferMDL    = c140e750       
00000226        0.40139600              
00000227        0.40140240          0000:       
00000228        0.40143040       01 ff 00 00 00 16      
00000229        0.40144320        UrbLink              = 00000000       
00000230        0.41361520      UsbSnoop -
IRP_MJ_INTERNAL_DEVICE_CONTROL, IOCTL_INTERNAL_USB_SUBMIT_URB       
        
00000231        0.41362720              
00000232        0.41363120      >>>>>>> URB 8 going down...     
00000233        0.41364480      --
URB_FUNCTION_BULK_OR_INTERRUPT_TRANSFER:     
00000234        0.41365600        PipeHandle           = c1600410       
00000235        0.41367280        TransferFlags        = 00000003
(USBD_TRANSFER_DIRECTION_IN, USBD_SHORT_TRANS
FER_OK) 
00000236        0.41368400        TransferBufferLength = 00000016       
00000237        0.41369360        TransferBuffer       = 00000000       
00000238        0.41370480        TransferBufferMDL    = c140e750       
00000239        0.41371520        UrbLink              = 00000000       
00000240        0.41558560              
00000241        0.41559200      <<<<<<< URB 8 coming back...    
00000242        0.41561200      --
URB_FUNCTION_BULK_OR_INTERRUPT_TRANSFER:     
00000243        0.41562720        PipeHandle           = c1600410       
00000244        0.41564880        TransferFlags        = 00000003
(USBD_TRANSFER_DIRECTION_IN, USBD_SHORT_TRANS
FER_OK) 
00000245        0.41566320        TransferBufferLength = 00000016       
00000246        0.41567680        TransferBuffer       = 00000000       
00000247        0.41569040        TransferBufferMDL    = c140e750       
00000248        0.41570240              
00000249        0.41570880          0000:       
00000250        0.41577520       90 01 00 00 00 00 00 00 00 20 00 00 00
08 00 00        
00000251        0.41578160          0010:       
00000252        0.41580880       01 00 00 00 00 00      
00000253        0.41582160        UrbLink              = 00000000       
00000254        0.69139680      UsbSnoop -
IRP_MJ_INTERNAL_DEVICE_CONTROL, IOCTL_INTERNAL_USB_SUBMIT_URB       

And this is what the kernel says about this:

Jul 15 22:28:32 driewerf kernel: usbserial.c: serial_open 
Jul 15 22:28:32 driewerf kernel: visor.c: visor_open - port 1 
Jul 15 22:28:32 driewerf kernel: usbserial.c: serial_ioctl - port 1, cmd
0x5401 
Jul 15 22:28:32 driewerf kernel: visor.c: visor_ioctl - port 1, cmd
0x5401 
Jul 15 22:28:32 driewerf kernel: usbserial.c: serial_ioctl - port 1, cmd
0x5402 
Jul 15 22:28:32 driewerf kernel: visor.c: visor_ioctl - port 1, cmd
0x5402 
Jul 15 22:28:32 driewerf kernel: usbserial.c: serial_set_termios - port
1 
Jul 15 22:28:32 driewerf kernel: visor.c: visor_set_termios - port 1 
Jul 15 22:28:32 driewerf kernel: visor.c: visor_set_termios - nothing to
change... 
Jul 15 22:28:32 driewerf kernel: usbserial.c: serial_ioctl - port 1, cmd
0x5401 
Jul 15 22:28:32 driewerf kernel: visor.c: visor_ioctl - port 1, cmd
0x5401 
Jul 15 22:28:32 driewerf kernel: visor.c: visor_read_bulk_callback -
port 1 
Jul 15 22:28:32 driewerf kernel: visor.c: visor_read_bulk_callback -
length = 22, data = 90 01 
00 00 00 00 00 00 00 20 00 00 00 08 00 00 01 00 00 00 00 00  

How is this possible?! Currently I work around this, since the first
messages
are ritual.

Then everything works fine, except for when it doesn't :)

The symptons are always the same: during a bulk read the usb device
seems
to have stopped sending prematurely. And this can be way prematuraly,
not just
off by one urb (of 64B). Now the NetSync protocol is like any other
sensible
protocol in that it provides the total length of the packet in its
header.
In this way, to me, the bug could be in coldsync, but then must happen
after the reception of the header (which indicates the length) and
during the recpetion of the data. The following is this code, quoted
from netsync.c:

        /* Read packet header */
        err = read(pconn->fd, hdr_buf, NETSYNC_HDR_LEN);
        if (err < 0)
        {
                 fprintf(stderr, _("Error reading NetSync packet
header.\n"));
                 perror("read");
                 return -1;    /* XXX - Ought to retry */
        } else if (err != NETSYNC_HDR_LEN)
        {
                 fprintf(stderr,
                         _("Error: only read %d bytes of NetSync packet
"
                           "header.\n"),
                         err);
                 return -1;    /* XXX - Ought to continue reading */
        }

        /* Parse the header */
        rptr = hdr_buf;
        hdr.cmd = get_ubyte(&rptr);
        hdr.xid = get_ubyte(&rptr);
        hdr.len = get_udword(&rptr);

        NET_TRACE(5)
                fprintf(stderr,
                        "Got header: cmd 0x%02x, xid 0x%02x, len
0x%08lx\n",
                        hdr.cmd, hdr.xid, hdr.len);

        /* Read the payload */
        NET_TRACE(5)
                fprintf(stderr, "Reading packet data\n");
        want = hdr.len;
        got = 0;
        while (want > got)
        {
                err = read(pconn->fd, pconn->net.inbuf+got, want-got);
                if (err < 0)
                {
                        perror("netsync_read: read");
                        palm_errno = PALMERR_SYSTEM;
                        return -1;
                }
                if (err == 0)
                {
                        NET_TRACE(5)
                                fprintf(stderr, "EOF in packet.\n");
                        palm_errno = PALMERR_EOF;
                        return 0;
                }
                got += err;
                NET_TRACE(6)
                        fprintf(stderr, "want: %ld, got: %ld\n", want,
got);
        }

See below a recording of a typical situation:

Inside netsync_read()
Got header: cmd 0x01, xid 0xac, len 0x00000436
Reading packet data
want: 1078, got: 64
want: 1078, got: 128
want: 1078, got: 192
want: 1078, got: 256
want: 1078, got: 384
want: 1078, got: 448
want: 1078, got: 512
want: 1078, got: 576
want: 1078, got: 640
want: 1078, got: 704
want: 1078, got: 768
want: 1078, got: 832

and then it hangs.

These are the last things the kernel tells, starting from the last
write:

Jul 16 19:41:21 driewerf kernel: usbserial.c: serial_write - port 1, 14
byte(s) 
Jul 16 19:41:21 driewerf kernel: visor.c: visor_write - port 1 
Jul 16 19:41:21 driewerf kernel: visor.c: visor_write - length = 14,
data = 20 01 20 0a 24 00 0
0 00 01 51 00 00 ff ff  
Jul 16 19:41:21 driewerf kernel: visor.c: visor_write_bulk_callback -
port 1 
Jul 16 19:41:21 driewerf kernel: visor.c: visor_write_bulk_callback -
port 1 
Jul 16 19:41:21 driewerf kernel: usbserial.c: port_softint - port 1 
Jul 16 19:41:21 driewerf kernel: visor.c: visor_read_bulk_callback -
port 1 
Jul 16 19:41:21 driewerf kernel: visor.c: visor_read_bulk_callback -
length = 6, data = 01 ac 0
0 00 04 36  
Jul 16 19:41:21 driewerf kernel: visor.c: visor_read_bulk_callback -
port 1 
Jul 16 19:41:21 driewerf kernel: visor.c: visor_read_bulk_callback -
length = 64, data = a0 01 
00 00 a0 00 04 2e 00 00 01 51 00 c3 04 24 00 00 06 01 20 01 fe 04 13 96
22 00 00 00 00 96 00 22
 00 4c c0 00 04 02 00 00 00 00 00 00 00 10 00 ff ff ff 01 fc fc fc 02 f7
f7 f7 03 f1 f1 f1  
Jul 16 19:41:21 driewerf kernel: visor.c: visor_read_bulk_callback -
port 1 
Jul 16 19:41:21 driewerf kernel: visor.c: visor_read_bulk_callback -
length = 64, data = 04 e7 
e7 e7 05 de de de 06 d2 d2 d2 07 c5 c5 c5 08 b8 b8 b8 09 ab ab ab 0a 9e
9e 9e 0b 94 94 94 0c 89
 89 89 0d 7b 7b 7b 0e 70 70 70 0f 66 66 66 03 c1 ff 00 00 00 00 00 00 00
00 ff 00 00 00 00  
Jul 16 19:41:21 driewerf kernel: visor.c: visor_read_bulk_callback -
port 1 
Jul 16 19:41:21 driewerf kernel: visor.c: visor_read_bulk_callback -
length = 64, data = 00 00 
00 00 ff 00 00 00 00 00 00 00 00 ff 00 00 00 00 00 00 00 00 ff 00 00 00
00 00 00 00 00 ff 00 00
 00 00 00 00 00 00 ff 00 00 00 00 00 00 00 00 ff 00 00 00 00 00 00 00 00
ff 00 00 00 00 00  
Jul 16 19:41:21 driewerf kernel: visor.c: visor_read_bulk_callback -
port 1 
Jul 16 19:41:21 driewerf kernel: visor.c: visor_read_bulk_callback -
length = 64, data = 00 00 
00 f0 00 00 00 00 03 03 9c ff b9 99 99 99 99 99 99 99 fc 99 99 99 99 ab
c8 00 00 00 00 00 00 00
 03 6a 63 ff 00 00 00 00 00 00 00 00 fe 00 00 00 00 00 37 95 00 00 00 00
00 00 00 07 05 d3  
Jul 16 19:41:21 driewerf kernel: visor.c: visor_read_bulk_callback -
port 1 
Jul 16 19:41:21 driewerf kernel: visor.c: visor_read_bulk_callback -
length = 64, data = 00 00 
07 00 3d 30 00 00 00 00 00 00 00 06 09 30 00 03 06 80 00 00 00 00 00 00
00 06 08 00 f8 33 33 33
 33 30 22 03 00 00 00 00 00 00 00 00 05 07 05 fc de ee ee ee ca 50 70 39
cc a5 00 00 00 00  
Jul 16 19:41:21 driewerf kernel: visor.c: visor_read_bulk_callback -
port 1 
Jul 16 19:41:21 driewerf kernel: visor.c: visor_read_bulk_callback -
length = 64, data = 00 00 
00 00 fc df ff ff ff ff f7 f8 08 ff ff ff a3 00 00 00 10 50 00 00 00 01
00 86 0a ff 60 8c 9f fd
 30 00 00 00 30 06 b0 00 00 00 00 83 05 e3 06 8c ff ff 80 00 0c 3d a0 00
36 6f a0 0c e7 00  
Jul 16 19:41:21 driewerf kernel: visor.c: visor_read_bulk_callback -
port 1 
Jul 16 19:41:21 driewerf kernel: visor.c: visor_read_bulk_callback -
length = 64, data = 07 58 
76 80 80 56 00 83 03 f7 09 04 c0 00 0c 3f b0 00 36 7f b0 0d f7 00 07 0a
07 f3 80 b8 00 03 f9 0b
 04 e0 00 0c 05 30 00 06 05 50 00 07 09 06 a8 80 b7 00 02 fa 04 d3 02 33
21 33 03 e1 30 03  
Jul 16 19:41:21 driewerf kernel: visor.c: visor_read_bulk_callback -
port 1 
Jul 16 19:41:21 driewerf kernel: visor.c: visor_read_bulk_callback -
length = 64, data = 55 03 
c6 55 30 03 53 34 35 53 30 65 35 53 08 5a 80 57 00 83 05 f8 09 04 c0 07
07 dd 30 6d 08 f9 0f b0
 0b f1 e5 7f ff d5 8e c6 ff e9 0c f7 7c 09 ff fd 77 d5 f7 07 df ff a3 00
00 00 80 00 00 01  
Jul 16 19:41:21 driewerf kernel: visor.c: visor_read_bulk_callback -
port 1 
Jul 16 19:41:21 driewerf kernel: visor.c: visor_read_bulk_callback -
length = 64, data = 05 04 
80 07 00 df 70 6d 0c f7 3f a0 0c fb fb ec ad fe 30 09 ff e0 ca cf df 7d
bf eb ad ff f3 03 f8 af
 ea bc fe 70 00 00 83 07 f6 00 8c 9f fd 30 03 af a0 6c 5f e0 0f b0 fb ff
a0 03 df a0 6f f8  
Jul 16 19:41:21 driewerf kernel: visor.c: visor_read_bulk_callback -
port 1 
Jul 16 19:41:21 driewerf kernel: visor.c: visor_read_bulk_callback -
length = 64, data = e0 00 
09 ff fd 08 ff 60 00 cf f5 08 f8 ff 50 00 cf c0 00 00 82 08 f5 8c 0b b3
00 03 7f d3 6c af 80 3f
 a0 7b 30 00 8f d0 9f c0 60 00 cf ed 3d fa 00 6f f3 0c d8 f9 00 5e f8 00
00 82 0a f3 d8 03  
Jul 16 19:41:21 driewerf kernel: visor.c: visor_read_bulk_callback -
port 1 
Jul 16 19:41:21 driewerf kernel: visor.c: visor_read_bulk_callback -
length = 64, data = bf e7 
00 03 0c f8 ec 05 fd 30 0f b0 db fa 00 6f f3 cf 80 20 8f cd 5f f3 0e f5
5e 98 f5 0a fc 00 00 80 

Also, noteworthy, and which may be a hint: when the palm finally
timeouts or
I disconnect the palm from the cradle (to reset it), the USB stack
ofcourse
detects the disconnect, and at that moment 64B more of data are
supplied:

Jul 16 19:42:31 driewerf kernel: hub.c: port 1 connection change 
Jul 16 19:42:31 driewerf kernel: hub.c: port 1, portstatus 100, change
3, 12 Mb/s 
Jul 16 19:42:31 driewerf kernel: visor.c: visor_shutdown 
Jul 16 19:42:31 driewerf kernel: visor.c: visor_close - port 1 
Jul 16 19:42:31 driewerf kernel: visor.c: visor_read_bulk_callback -
port 1 
Jul 16 19:42:31 driewerf kernel: visor.c: visor_read_bulk_callback -
length = 64, data = f3 00 
f8 5c ff ff ea 30 00 00 83 6f 68 ff 80 70 01 05 cd 76 00 07 50 06 9b 75
37 70 00 07 f6 aa 70 00
 30 07 73 7c 00 57 50 3d f5 f8 00 7a a8 30 00 00 00 03 ef e7 80 00 01 00
8d 00 00 00 00 99  
Jul 16 19:42:31 driewerf kernel: visor.c: visor_read_bulk_callback -
port 1 
Jul 16 19:42:31 driewerf kernel: visor.c: visor_read_bulk_callback -
nonzero read bulk status r
eceived: -2 

And this is also noted in the hanging coldsync:

want: 1078, got: 896

Now, with this much information, you should be able to locate the
problem in just a snap :)

One more thing I will try, is to find a way to get the naked USB data,
so
that I can verify if the Palm actually sends it, or not. Do you know
where in the kernel that I should look for that?

koen.
This message was sent through the coldsync-hackers mailing list.  To remove
yourself from this mailing list, send a message to majordomo@thedotin.net
with the words "unsubscribe coldsync-hackers" in the message body.  For more
information on Coldsync, send mail to coldsync-hackers-owner@thedotin.net.